DOCUMENT RESUME

ED 243 891                                                  TM 820 280

AUTHOR          Schwartz, Judah L., Ed.; Garet, Michael S., Ed.
TITLE           Assessment in the Service of Instruction.
INSTITUTION     Massachusetts Inst. of Tech., Cambridge. Div. for
                Study and Research in Education.
SPONS AGENCY    Ford Foundation, New York, N.Y.; National Inst. of
                Education.
PUB DATE        Jan 81
GRANT           G-79-0045
NOTE            [page count illegible]
PUB TYPE        Collected Works - Conference Proceedings (021) --
                Viewpoints (120)
EDRS PRICE      MF01/PC07 Plus Postage.
DESCRIPTORS     Criterion Referenced Tests; Educational Assessment;
                Educational Diagnosis; Elementary Secondary
                Education; Instruction; *Instructional Improvement;
                *Instructional Materials; Learning Processes;
                Measurement Techniques; Standardized Tests; Testing;
                *Tests; *Test Use; Test Validity
IDENTIFIERS     *Alternatives to Standardized Testing

ABSTRACT
     In an effort to examine issues raised by the effort
to assess the performance of educational institutions, a project
focusing on the social purposes and intellectual foundations of
assessment practices in education was initiated. The primary goal of
the project was to explore the possibility of developing new, more
appropriate educational assessment strategies. As part of the
project, several panels were convened, each focusing on a broad
purpose of educational assessment. This document is the report of the
first panel, which focused on the role of assessment in classroom
instruction. It includes papers by Eva L. Baker, Eugenia Kemble,
Philip Jackson, David Hawkins, J. Parker Damon, Asa G. Hilliard III,
Howard E. Gruber, Robert T. Keegan, Judah L. Schwartz, Edwin F.
Taylor, Nancy Willie, and Michael S. Garet. The panel concludes with
four recommendations: (1) in developing new assessment materials, it
is worth starting small; (2) the development of new assessment
materials should be carried out by groups with a strong interest in
the content areas being assessed; (3) schools interested in adopting
new forms of assessment should begin by focusing on a small number of
classrooms and subject areas; and (4) making new forms of assessment
work in practice will depend on the sensitivity and ingenuity of
teachers. (BW)



***********************************************************************
*    Reproductions supplied by EDRS are the best that can be made     *
*                     from the original document.                     *
***********************************************************************







ASSESSMENT IN THE SERVICE OF INSTRUCTION

edited by
JUDAH L. SCHWARTZ

U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)

This document has been reproduced as received from the person or
organization originating it.
Minor changes have been made to improve reproduction quality.
Points of view or opinions stated in this document do not necessarily
represent official NIE position or policy.

Massachusetts Institute of Technology

the report of a study panel to
THE FORD FOUNDATION and
THE NATIONAL INSTITUTE OF EDUCATION
JANUARY 1981

WORKING DRAFT
copyright 1981 MIT



TABLE OF CONTENTS

INTRODUCTION .............. Judah L. Schwartz & Michael S. Garet

TESTS & INSTRUCTION: AN HISTORICAL OVERVIEW .............. Eva L. Baker

TEACHERS' NEEDS FOR [title partly illegible] .............. Eugenia Kemble

Part II. Assessment and Instruction: What Might Be

THE UNCERTAINTIES OF TEACHING .............. Philip Jackson

THE UNPREDICTABLE NATURE OF LEARNING .............. David Hawkins

Part III. What To Do and What To Avoid Doing

INVESTIGATIVE TEACHING .............. J. Parker Damon

CULTURAL VARIATION & LIVING ASSESSMENT .............. Asa G. Hilliard III

FOSTERING INTELLECTUAL DEVELOPMENT .............. Howard E. Gruber & Robert T. Keegan

Part IV. Some Steps Now Underway

CATEGORICAL TEST VALIDATION:
THE TORQUE APPROACH .............. J. L. Schwartz, E. F. Taylor & N. Willie

Part V. [title illegible]

CONCLUSION & RECOMMENDATIONS .............. Judah L. Schwartz & Michael S. Garet



Programs designed to assess student achievement and the performance of
educational institutions have become widespread in American education. But as
assessment programs have multiplied, doubts about their effectiveness have
grown. Serious questions have been raised about the methodology of
standardized educational testing. Disagreement has mounted over the
interpretation of test scores. And controversy has arisen over the ways in
which tests are used in schools and universities.

School boards, departments of education, legislatures, and the courts
have all been involved in the debate over testing policy and practice. Large
numbers of school districts have implemented testing programs designed to
diagnose student progress in the basic skills. Some states have enacted laws
requiring that students pass "minimum competency tests" for promotion from grade
to grade. Other states have enacted laws requiring that students achieve a
minimum test score in order to receive a high school diploma.

As of this writing, the legislature [illegible] "Truth
in Testing" law requiring that all standardized tests used in the state for
university admissions decisions be made public after they are administered, and
similar laws are being considered by other states and the United States
Congress. At the same time, a U.S. Court in one state has ordered a moratorium
on the state's minimum competency testing program. And a U.S. Court in another
state has ordered that IQ tests no longer be used to place children in classes
for the educable mentally retarded.

Altogether, the effort to assess the performance of educational
institutions has raised a number of difficult issues:

     * What roles should assessment programs play in educational policy
       and practice?

     * What kinds of assessment materials are appropriate for these roles?

     * What should be expected of educational tests, when they are used?

     * What should be taken into account in distinguishing appropriate
       and inappropriate uses of educational assessment?

In an effort to examine these questions, the Division for Study and
Research in Education at M.I.T., with the support of the Ford Foundation and the
National Institute of Education, has initiated a project focusing on the social
purposes and intellectual foundations of assessment practices in education. The
primary goal of the project is to explore the possibility of developing new,
more appropriate educational assessment strategies.

There is a growing critical literature on the role of educational
testing in the schools [illegible] in currently
available standardized tests. But little has emerged on alternatives to present
practice. Thus, the aim of our project is a synthetic one: to search for
positive guidelines for new approaches to educational assessment.

One of the main assumptions of our project is that there are a
number of distinct social purposes educational assessment is expected to serve,
and each of these purposes may require somewhat different assessment methods,
instruments, and practices. First, schools conduct assessments to obtain
feedback on student progress, so teaching methods and materials can be adjusted
appropriately. Second, educational institutions conduct assessments as the
basis for reports to parents, school boards, and government agencies, as a
way of promoting accountability. Finally, schools and universities conduct
assessments to provide information on whether a student has mastered a body of
knowledge or skills (for purposes of awarding a diploma or license), or to
provide information on how well a student [illegible] (for purposes
of selecting applicants for colleges and professional schools).



Generally, these distinct social purposes have all been addressed
using the same sorts of assessment instruments. There is little a priori reason
to believe, however, that instruments designed to serve one of these purposes
are equally suited to serve the others. Indeed, it seems more likely that the
opposite is true. Consequently, we have organized our project by examining each
of several of the social purposes of assessment in turn and asking what types of
instruments and practices might best serve each if the constraints of present
practice, tradition, and vested interest were absent.



As part of our project, we have convened several panels, each focusing
on one of these broad purposes of educational assessment. This document is the
report of the first panel, focusing on the role of assessment in classroom
instruction.

In forming the panel on instruction, we brought together people with
diverse perspectives on education, and asked them to think broadly about the
role educational assessment might play in the teaching and learning process.
The members of the panel were:

Eva Baker (Director, Center for the Study of Evaluation, University of
California at Los Angeles)

J. Parker Damon (Principal, McCarthy-Towne School, Acton,
Massachusetts)

Howard Gruber (Professor of Psychology and Director, Institute for
Cognitive Studies, Rutgers University)

Walt Haney (Senior Research Associate, The Huron Institute)

David Hawkins (Professor of Philosophy, University of Colorado)

Asa Hilliard III (Dean, School of Education, San Francisco State
University)

Philip Jackson (Professor of Education, University of Chicago)

Robert Keegan (Rutgers University)

Eugenia Kemble (Special Assistant to the President, American
Federation of Teachers)

Carmen Perez (New York State Department of Education)

Edwin Taylor (Education Development Center)

Sheldon White (Professor of [illegible], Harvard University)

Nancy Willie (Education Development Center)

Jerrold Zacharias (Professor Emeritus, Massachusetts Institute of
Technology)

The Panel on Assessment and Classroom Instruction met for the first
time in March of 1979. Over the next year, members of the Panel prepared
outlines, comments, and draft papers, which were circulated and discussed at a
second Panel meeting held in February of 1980. The Panel completed its work in
June of 1980.



This document, while reflecting the panel's views, is woven together with
only the lightest of threads. The individual authors are in no way to
be held responsible for the coherence the editors have not been able
to make sufficiently explicit.



We wish to thank Lewis Pike of the National Institute of Education and
Marjorie Martus of the Ford Foundation for the encouragement they have offered
us in this work.



It is the editors' pleasure to acknowledge the assistance and good
humor of Ligia Domingo in the preparation of the manuscript.

Judah L. Schwartz
Michael S. Garet

Cambridge, Mass.
Stanford, Calif.
January 1981






CHAPTER 1 
INTRODUCTION



Since the turn of the century, educators have hoped that the practice
of classroom teaching might be reformed through the use of standardized
educational tests. As the discipline of psychological measurement took form in
the first few decades of the twentieth century, practitioners believed that
achievement tests might have a significant and beneficial effect on teaching.
Tests might help teachers make more objective judgments about student progress.
They might offer diagnostic information on student learning problems. And they
might assist teachers in devising and evaluating instructional strategies.

There is growing doubt, however, about whether conventional
standardized tests have provided much support for the classroom teacher. Many
observers of educational testing argue that the tests commonly in use fail to
provide information useful in the practice of teaching. Indeed, some observers
argue that of the primary purposes tests are expected to serve (instruction,
accountability, selection, and licensure), tests serve instruction least well.
For example, the report of a recent National Institute of Education "Conference
on Research on Testing" concluded: "Instructional guidance is the educational
activity which is least served -- some published articles have said not at all
served -- by existing tests." (NIE 1979)

Serious questions about the instructional value of tests have been
raised by several recent studies of the role of educational testing in the
classroom. One study, conducted by the Center for the Study of Evaluation at
UCLA, found that teachers rarely use standardized tests to guide instruction.
Instead, they rely on tests primarily to confirm judgments about students made
in other ways. (Yeh 1978) Another study, conducted by the University of
Pittsburgh, found that while teachers make frequent instructional decisions
about individual students, they seldom if ever use test results as a basis for
these decisions. (Resnick, Salmon-Cox & Sproul 1980) And in a survey conducted
by the American Federation of Teachers, a majority of the teachers surveyed
reported that tests do not provide sufficient information on instructional
materials and activities. (Kemble, this volume)

Given the questions raised concerning the instructional value of
conventional testing, we have set out to reconsider the role of assessment in
the classroom. In particular, we have focused on the following problem: How
can assessment strategies be devised to provide information helpful in the
teaching and learning process? What assessment strategies would support
instruction in the classroom?

In addressing these issues, we have been led to a view of the role of
assessment in the teaching and learning process which differs in significant
ways from the conventional view of the instructional uses of testing. In the
literature on testing, it is possible to identify two somewhat distinct ways of
thinking about the relationship between assessment and instruction. One
approach grows out of the tradition of standardized psychological testing, and
the other grows out of a more recent concern with learning theory and
instructional objectives. Some of the ideas we will propose can be clarified by
contrasting them with these two common views.






One view of the relationship between assessment and instruction is
based on psychological testing. Standardized psychological testing, of course,
has a long history and a tradition of practice. Generally speaking,
standardized tests are supposed to detect differences between individuals with
respect to stable, underlying traits or characteristics, such as visual memory
or aptitude. [illegible]
"intelligence" or IQ test, which was originally developed to predict how well a
child might do in school.

For our purposes, the most important standardized tests are the
general achievement batteries, diagnostic tests, and readiness tests.
Standardized achievement batteries are widely used in the elementary grades, and
they are designed to compare student performance in broad educational subject
areas, such as reading, arithmetic, spelling, and language usage. Diagnostic and
readiness tests are supposed to provide somewhat more specific information on a
student's strengths and weaknesses in an instructional area. A third grade
diagnostic reading test, for example, might provide scores on auditory
vocabulary, auditory discrimination, phonetic analysis, structural analysis, and
comprehension.

Standardized tests are thought to be useful in instruction because of
a belief in their ability to predict future performance. For example, reading
readiness tests are often used at the end of kindergarten or the beginning of
first grade to predict which children will have difficulty learning to read.
And standardized achievement tests are used to group children for instruction,
under the assumption that children with similar scores have similar
instructional needs.

The intended role of standardized tests in instruction is somewhat
similar to the role of diagnostic tests in clinical medicine. Both educational
tests and medical tests are used because they are expected to be good
predictors. Medical tests are used, of course, because they are helpful in
predicting the presence or absence of disease. Similarly, educational tests are
supposed to predict the presence or absence of learning problems. (1)

The "medical model" of educational testing suffers from one main
defect. Unlike medical tests, current standardized tests provide little
information useful in what might be called "differential diagnosis." In spite
of the fact that some tests are labelled "diagnostic," they generally provide
little specific guidance about a student's weaknesses. While
standardized tests sometimes predict student performance, they rarely help
explain why students perform as they do.

This defect in the "medical model" of educational testing may simply
indicate that researchers have not yet been able to identify and measure the
underlying traits that influence learning. Or, the defect may be more serious.
Perhaps, as some members of our panel argue, the notion of measurement, borrowed
from the physical sciences, is inappropriate when applied to human talents and
abilities. (Schwartz, Taylor & Willie, this volume)

In the last twenty years, a second, somewhat distinct view of the role
of assessment in instruction has emerged, drawing in part on experimental
learning theory. In this view, tests should be designed, not to detect
individual differences on underlying traits, but rather to assess student
progress toward explicit instructional objectives. This emphasis on
instructional objectives is shared by educational
innovations including programmed instruction, criterion-referenced testing,
domain-referenced testing, and mastery learning.



To develop an objective-based test in a particular subject area, it is
necessary to divide the subject area into instructional
domains. One common way of doing this is to postulate a sequence of instruction
leading from lower-level to higher-level skills. Once a sequence of objectives
is established, the role of assessment in instruction is straightforward.
Students are tested at the beginning of each instructional unit, to identify
the areas in which instruction is required, and at the end of each unit, to
assess mastery.

Although the movement to develop objective-based tests is still young,
several questions can be raised about the instructional value of the objective-
based tests currently in use. First, the effort to divide subject areas into
sequences of objectives and sub-objectives often results in systems that are
extremely large. The Individualized Mathematics System, for example, an
objectives-based arithmetic curriculum for grades 1-6, involves 393 objectives,
organized into 11 content areas and 9 levels of difficulty. Because of the size
of systems like these, they are often difficult to integrate with other
instructional materials and activities.



Perhaps more important, the division of subject areas into
instructional domains often [illegible], especially for objectives-based
tests that are not linked to particular curricula. In general, little empirical
work with children has been done to determine whether the instructional domains
that have been carved out have any instructional significance.

Finally, instructional objectives systems often emphasize rote skills
at the expense of conceptual understanding. The division of subject areas into
small units often produces an arid, if not atomistic or reductionist, conception
of knowledge.

In summary, there are two popular views of the role of assessment
in instruction. One approach has involved attempting to integrate standardized
tests and instruction, through a model somewhat similar to medical diagnosis.
The other approach has involved attempting to integrate objective-based tests
and instruction, by organizing instruction in small, discrete units, so that
tests can be inserted along the way.

We believe that there is another way of thinking about the
relationship between assessment and instruction that may be more
helpful in developing useful assessment materials. Rather than beginning a
discussion of assessment by asking how tests should be developed, we think it
more helpful to ask how teachers, in their ordinary, day-to-day
experiences, figure out what their students know.

In the act of teaching, teachers ask questions
and make judgments. Teachers continuously ask themselves whether this child
understands a particular concept, whether that child should spend more time in
reading rather than social studies, whether this lesson [illegible],
whether that lesson is moving too rapidly or too slowly. Teaching itself is a
[remainder of sentence illegible]

We believe it is helpful to view assessment materials and tests
as ways of expanding upon the inquiry process already inherent in teaching.
Assessment materials, in this view, are not something separate from instruction,
something to be used before or after instruction, but are instead something
which is a continuous part of the act of teaching itself.

From this perspective, assessment materials might be conceived as
materials much like regular classroom exercises, tasks, and games, but
designed to provide a bit more information about how a student is thinking and
what a [illegible]. Assessment materials should provide teachers a way
of [illegible] regular classroom work, to see why the work
was done the way it was.

Assessment materials of this kind might help teachers and students in
several ways. First, they might help a teacher find pattern and order in the
strengths and weaknesses appearing in a child's work. For example, a teacher
might notice that a particular child has difficulty forming plural nouns, and an
assessment exercise focusing on plurals might call attention to some potential
sources of the problem. Another child might perform erratically on arithmetic
word problems, and an assessment game might help determine which sorts of word
problems are causing difficulty and why.

Assessment materials of this kind might also serve another purpose.
They might help teachers communicate with each other about individual children
and their work. For example, such assessment materials might provide well-
focused, concrete examples of a child's work, so that teachers can discuss
problems and suggest solutions. In the same way, assessment materials might
help teachers communicate with parents about specific strengths and weaknesses.

Altogether, assessment strategies of the type we are proposing would
have three main characteristics. First, they would help identify regularities
underlying the strengths and errors in children's work. Second, they would
respect diversity among children, and they would draw on children's life
experiences in their own culture. Third, they would serve as the basis for
dialogue among teachers, students, and parents.

In the report that follows, we develop this alternative view of the
role of assessment in instruction in some detail. The report contains five
parts. In Part I, we discuss the problems teachers face in trying to use
currently available testing materials. In Part II, we present the main
philosophical themes underlying our view of assessment and instruction. In
Part III, we draw on these themes to outline some of the characteristics we
believe new assessment materials should possess, and in Part IV we describe
a project designed to develop assessment practices that embody some of the
ideas discussed in Part III. Finally, in Part V, we offer some recommendations
for the development of new assessment materials, we consider what it might cost
to move in the directions we describe, and we suggest some organizational and
political strategies that might promote the practices we propose.







Lauren Resnick, Leslie Salmon-Cox, and Lee Sproul, THE SOCIAL FUNCTIONS OF
EDUCATIONAL TESTING, University of Pittsburgh, 1980.

Ralph W. Tyler and Sheldon H. White, TESTING, TEACHING, AND LEARNING: Report of
a Conference on Research on Testing, National Institute of Education, Washington,
D.C., 1979.

Jennie Yeh, TEST USE IN SCHOOLS, Washington, D.C.: U.S. Department of Health,
Education, and Welfare and National Institute of Education, 1978. One major
study of the role of tests in the classroom has obtained results quite different
from those obtained by Yeh and the others discussed above. Michael D. Beck and
Frank P. Stetz, of the Psychological Corporation, conducted a large sample
survey of teachers, and a substantial majority of teachers reported using tests
for instructional purposes. One reason for the discrepancy in the results of
these studies may be that the studies discussed above asked somewhat more
specific questions about the role of tests in instruction than did Beck and
Stetz. See Michael D. Beck and Frank P. Stetz, "Teacher Opinion of Standardized
Test Use and Usefulness," Paper presented to the American Educational Research
Association, San Francisco, April, 1979.



NOTE 



1 



For example, newborn infants are often given a test that measures the
level of blood phenylalanine, because the level of blood phenylalanine is
associated with PKU, an inherited metabolic disease. Infants whose blood levels
are above normal are more likely to have PKU than children with low blood
levels.

Like educational tests, medical tests are often far less than perfect
predictors. Not all children with high levels of blood phenylalanine, for
example, actually have PKU. Babies who are premature sometimes show high levels
of blood phenylalanine.

For a useful account of the predictive model and the role of
diagnostic tests in medicine, see Robert S. Galen and S. Raymond Gambino, BEYOND
NORMALITY: THE PREDICTIVE VALUE AND EFFICIENCY OF MEDICAL DIAGNOSIS, New York:
John Wiley and Sons, 1975.
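
The caution in this note, that a high reading does not guarantee the presence
of disease, can be made concrete with a small calculation. The sketch below is
an illustration only and is not part of the panel's analysis: the function name
and all three input figures (prevalence, sensitivity, specificity) are
hypothetical, chosen to show how a rare condition keeps the predictive value of
a positive result low even when the test itself is quite accurate.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Return P(condition | positive test) using Bayes' rule.

    prevalence:  fraction of the population that has the condition
    sensitivity: P(positive test | condition present)
    specificity: P(negative test | condition absent)
    """
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical figures for illustration only; they are NOT taken from the
# report or from any clinical source.
ppv = positive_predictive_value(
    prevalence=0.0001,   # assume 1 case per 10,000 infants
    sensitivity=1.0,     # assume every true case tests positive
    specificity=0.999,   # assume 1 false alarm per 1,000 healthy infants
)
print(f"Share of positive results that are true cases: {ppv:.3f}")
```

With these assumed figures, only about one positive result in eleven
corresponds to a true case. The same arithmetic applies to an educational test
used to flag a comparatively rare learning problem.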




PART I

ASSESSMENT AND INSTRUCTION: WHAT IS



Educational achievement testing is a familiar feature of elementary
and secondary school life. Achievement tests are widely administered, and their
results are periodically reported to teachers, school departments, parents,
government agencies, and even, from time to time, local newspapers. We begin
our discussion of educational testing with a series of questions. What kinds of
tests are generally used in the schools? What assumptions about teaching and
learning do these tests reflect? What role do the tests play in classroom
instruction? And how well do they serve the teaching and learning process?

Eva Baker, Professor of Education and Director of the Center for the
Study of Evaluation at the University of California at Los Angeles, provides an
historical overview of educational testing in the United States. Baker begins
her paper by discussing the origins of standardized psychological testing and
educational achievement tests. As Baker points out, standardized achievement
testing grew out of an effort to identify individual differences on underlying
traits, largely for purposes of educational prediction. Baker concludes her
discussion of standardized testing by raising some serious questions about its
instructional value.

In the last twenty years, a second approach to educational testing has
emerged, partly as a result of criticisms of the instructional value of
conventional standardized testing. Baker outlines the development of this
second tradition (sometimes called objective-based or criterion-referenced
testing) and discusses some of the difficulties involved in specifying
instructional objectives and using criterion-referenced tests in the classroom.
One response to these difficulties has been a recent shift of attention from
instructional objectives to learning domains. Baker raises some questions about
this recent trend, and then discusses some dilemmas that must be considered in
the development of new, more useful assessment materials.

In the following chapter, we turn from an analysis of the assumptions
underlying conventional educational tests to an examination of the role these
tests play in the teaching and learning process. Eugenia Kemble, Special
Assistant to the President of the American Federation of Teachers, reports the
results of several surveys of teachers' opinion about testing and draws some
conclusions about the kinds of information teachers would like tests to provide.



According to a survey conducted by the AFT, teachers believe current
tests provide insufficient guidance for [illegible]. The survey indicates that
teachers desire assessment materials that provide more information about
individual students, and especially about student strengths and weaknesses.

Kemble then outlines some characteristics assessment materials should
possess, if tests are to support teachers in the practice of teaching. In
particular, Kemble argues that the traditional distinction between assessment
and instruction should be reconsidered. Assessment materials not only generate
information about students. They also influence what is taught and what is
learned. Thus, it is essential that educational testing materials reflect the
depth and diversity of the aims of education.




chapter 2

tests and instruction:
an historical overview

Eva L. Baker

University of California
Los Angeles

Educational testing provides accountability; testing raises standards and
facilitates learning; educational testing proscribes teaching. These
paradoxical [...] of testing occur partly because of the operating
understanding we have for the enterprise. Where do these understandings come
from? What is an achievement test? A test? Do we know whether the tests we
have are the tests we might have? Should tests be resurfaced, remolded, or
retained? How did we get what we have in educational testing?

In common experience, tests have come to mean "trials", as in the "trials by
fire" suffered by mythic heroes. Tests are endured because of the rewards they
promise upon success. In the sense that mettle is tested and ability found out,
tests are thought to have revelatory power. They investigate personal limits
and secrets; they display what people are inside. Tacit acceptance of this
revelatory potential is, in part, what makes people anxious about tests.

Tests and trials are also terms common to the language of judicial
procedure. Legal "tests" create additional nuance for our definition because
courts are convened to discover the truth. In law, truth is to be determined
fairly, and due process requires that particular rules of demonstration and
evidence be followed.

Tests are also employed in medicine, to verify or exclude alternative
causes of particular symptoms. In the realm of science, tests are used to
examine the tenability of hypotheses. And in engineering and applied science,
tests may be used in reaching critical decisions (for example, to determine
whether a machine falls within a band of acceptable performance) or they may
function more simply as observation points within a carefully specified set of
conventional procedures.

The word "test" has been woven into our most casual conversation, partly no
doubt as a result of our fascination with technology, and with research and
development. Test pilots, men who braved the dangers of new supersonic aircraft
two decades ago, have now been reified by inversion: instead of people, we have
events; "pilot tests" stand for the tryout of something under development, a
trial which occurs under conditions of at least minimum verisimilitude.




Revelation, psychology, law, medicine, science and technology are high-
status sources of the connotations associated with testing. Undoubtedly, all
these uses and understandings of "test" somehow contribute to the range of
interpretation evident in educational testing and, as certain, all occur against
a background socioeconomic system based on competition. But as influential as
these connotations may be, the specific applications of testing in education
require further exploration.

Our discussion narrows, then, to the uses of achievement tests in education.
The principal uses of tests since their inception have been for placement
(to decide who belongs in a particular class or instructional program),
for credentialing or grading (to determine who did how well, or who did well
enough), and for program evaluation (to find out what changes are needed in
educational sequences). While there are numerous other uses of tests, let us
confine discussion to the three identified for achievement testing.

THE RISE OF TESTING

The relative emphasis given to placement, credentialing and evaluation has
varied over the [...] testing,
placement or selection was paramount. One of the first standardized educational
tests developed in the United States, for example, was used to select men to be
officers in U.S. military service. Tests were developed so that they would
detect individual differences among potential officers. The test development
paradigm was parallel to that employed by Binet in his well known explorations
of human intelligence.

The prevalent statistical models of the time reinforced the differentiating
function of tests and provided comparative information about individual
performance. Of high interest was whether an individual placed in the top,
middle, or bottom of a distribution of scores. Because reliability, or the
consistency of a person's rank in a distribution, was needed, great emphasis was
placed on the stability of ranking; a person who was best on one day should,
when readministered the test, be best again, or close to it. Concomitant with
this notion of test stability was the interpretation of human test performance
as a measure of a stable characteristic or general trait possessed by the
learner.
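
The idea of rank stability can be made concrete with a small computation. The
Python sketch below is an illustration only, not a procedure taken from the
test developers of the period; the function names and the sample scores are
hypothetical. It correlates each person's rank on a first administration with
the rank on a readministration (a Spearman rank correlation), one common way
of expressing how consistently a test orders the same people.

```python
def ranks(scores):
    """Rank scores from 1 (highest); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length lists with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_stability(first, second):
    """Spearman rank correlation between two administrations of a test."""
    return pearson(ranks(first), ranks(second))
```

A value near 1 means the two administrations order the students in nearly the
same way; a value near 0 means the ordering is largely unrelated. The
coefficient says nothing about how much any student knows, only about the
consistency of relative standing.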

The importance of prediction cannot be overstated in this model.
Philosophically, the model suggests that schooling operates to sort into groups
people of various stable and predictable characteristics thought to profit
differentially from alternative instructional regimens. One extension of this
view can be discerned in recent research by those who want to match a student's
instructional treatment with the student's cognitive style. This line of
research is called, alternatively, trait-treatment interaction or aptitude-
treatment interaction, and the "trait" or "aptitude" is usually measured by
achievement tests. (Immediately, one should perceive a basic conflict in this
view of the function of testing: "achievement" is seen both as a predictable,
stable trait and as something amenable to change, perhaps through schooling.)

Tests of achievement, developed originally in order to differentiate among
individuals (including ubiquitous college entrance examinations), gained
legitimacy from a number of sources. First, the tests promised to add an
important refinement to selection processes, a refinement in the name of the
democratic principle of fairness. It became socially less acceptable, although
perhaps not less [...] educational elite exclusively from among the ranks of
the wealthy. Tests seemed a fair way to broaden the information used in
selecting students for higher education. Second, it was important to recognize
that selection into programs such as university education or officer's
candidate school was regarded for a long time as a special or uncommon reward
and opportunity, not within the aspirations of most of the population. People
voluntarily "sat for" college entrance examinations; they were not required to
do so. Thus, the tests were accepted as a legitimate tool to identify the
deserving few. In general, those selected for college were rewarded (and were
able to afford the option); those not accepted were not stigmatized or
regarded as failures.

Concomitant with these social interpretations of testing was the continued
[...] of [...] statistical [...] support the
sorting and placement purpose of tests. Before World War II, additional uses for
achievement testing had not gained much importance. Except in particular
professional or technical fields, and in the New York State Regents
examinations, certification (the releasing of students from programs of
instruction, or the passing of students from one program to another) was the
private responsibility of academic personnel in schools. The teachers' right to
assign grades was understood and generally unchallenged. Tests used to evaluate
teaching and instruction appeared only sporadically with no concrete impact.

Since World War II, the role of standardized testing in education has
expanded markedly. What forces account for this expansion? First, one may
point to the democratization of schooling and the delivery of universal
education. More and more students attended high school. Graduation became
expected rather than exceptional. And through the effects of legislation
designed to reward those with military service, college education became an
economic possibility for a more diverse set of students. The student loan
programs and the rapid growth and variation of the higher education system
changed normative values, and students in increasing numbers planned to go and
actually went to college.

While the expansion of schooling helped promote growth in the use of tests,
undoubtedly the single most identifiable influence in the post-war field of
educational achievement testing was the federal government, in both its direct
and indirect effects. The federal establishment supported educational research
in the sixties on a scale unlike that experienced previously. Psychologists and
educators, in their thrilled (or, at least, cheerful) exploration of
instructional and curricular variables, almost exclusively depended upon the
growing array of commercially available achievement tests. Education schools
began to shift from predominately teacher training "craft" centers to bastions
of educational research. If much of the education research produced in the
sixties was not heart stopping, nonetheless the availability of federal
research support grew (such as that offered by the Cooperative Research Act of
1963).

In parallel, the expansion of higher education, with the strict imposition
of "publish-or-perish" criteria, fostered dependence upon the production of
science-like educational research. Waiting to play their part were standardized
achievement tests. The direct and final push establishing the legitimacy of
achievement tests came from the great investment in federally inspired social
and educational programs in the 1960's. In Title One, a program to improve the
learning of disadvantaged students, and in Head Start, a similar program from a
different bureaucracy, the government required evaluations of student learning
as a measure of program effect. Spawning a sub-speciality of still growing
proportions, and aped as they were by both state programs and state evaluation
requirements, federally mandated program evaluations fixed achievement tests as
the criterion of choice for educational evaluation. If the most competent
educational researchers, by and large, accepted such tests as adequate criteria
to judge their theories with hardly a blink, one could similarly assume the
suitability of these tests to determine program quality. Coleman used such
tests to assess the policy implications of segregation, and Jencks continued
that tradition.

Although there were early and vocal dissenters, the testing industry grew,
encouraged and supported, by and large, by [...]
"experts" in universities. Percentile ranks, stanine scores, and reports of
individual [...] reported
reading achievement scores, and school boards came and went on the strength of
such scores. Poor test performance became a scandal, and test scores served
as a marker for "educational quality". Repeatedly, conventional wisdom about
the effects of one or another clearly different programs was swamped by test
results that showed no difference in achievement among differing concentrations
of resources. It is only lately that [...] of test scores as measures of
educational effect have been challenged.

So, advances in educational psychology, sunny optimism in the federal
support of school interventions and research, the rise of higher education
supported again by the government, and the blessing of achievement tests as
satisfactory devices to measure educational growth conspired to create the
climate of assessment we are faced with today. Particularly:

1. Tests of achievement evolved because of needs to choose the
best students for special opportunities like college.

2. Tests were designed to measure stable student characteristics.

3. The broadening of student population and the value placed
upon higher education contributed to the acceptance of standardized
achievement testing.

4. The collective desire of educational research to approximate
science, combined with the expansion of higher education and salient
tenure decisions, inexorably increased experts' dependence upon
achievement tests for research studies.

5. Achievement tests were also used to certify students and to
assess program efforts, although these uses came somewhat later.

6. Federal and state educational programs requiring evaluation
of innovations fed the growing testing industry.

7. Test scores, legitimated by these and other well publicized
and influential events such as the Coleman studies, created a climate
where testing is seen as an essential component in educational
programs and the single best indicator of educational quality.






STANDARDIZED TESTS AND INSTRUCTION

The tests used as "achievement" measures in the schools have mainly been
commercially available, standardized tests, usually dealing with a subject
matter or skill area, such as mathematics or reading comprehension. The fact
that most achievement tests are both commercial and standardized has had a
large influence on the role of testing in instruction.

The commercial character of such tests has led them to be considered partly
in terms of their marketability. Tests which are marketable are those with the
broadest appeal, that is, those least tied to local or idiosyncratic needs.
Thoughtful scholars in this field have pointed out that the requirement to be
general and appropriate for broad use conflicts with the test's function to
detect particular effects associated with identified programs (in program
evaluation contexts). Sampling a broad field, such as reading, with a test not
sensitive to particular pedagogical methods can (and does) produce results that
inaccurately portray achievement of students.

An even more insidious problem has been identified by Porter and his
colleagues in Michigan in their analysis of mathematics and reading standardized
tests. The technical manuals accompanying such tests describe the general
topics to be tested, but review shows that the actual number of items used to
assess different skills varies greatly and could selectively penalize classrooms
of students whose instruction has not matched the same set of content emphases.
Remedying the problem, that is, trying to select a test which better fits
particular instruction, is a course of action hampered by concerns for test
security, a topic to be treated more extensively later.

The term "standardized" brings with it additional difficulties in test
application. "Standardized" refers to at least two different, but interacting,
features of testing. One interpretation of "standardized" relates to the
conditions of administration of the test. Wherever the test is given, a
particular set of uniform directions is used; a specified amount of time is
allocated; certain pencils may be required; common answer sheets can be
provided; student questions about the test may (or may not) be answered;
instructions to guess may or may not be given. The standardization, or
exchangeability, of conditions of administration undoubtedly contributes to the
test's special and ceremonial qualities. Such tests could not be given
comfortably within the regular classroom daily life. Special rules are used and
these rules contribute to the distinctiveness of the testing occasion compared
with other classroom events. Atypical conditions likely affect anxiety most
obviously for students, but with growing concern by teachers. The
standardization of conditions, thus, underscores the foreignness of the test and
sets it apart from "normal" instructional activities.

A second interpretation of the term "standardized" relates to the
standardized way in which test results are to be interpreted. In common
practice, the standards used to interpret test performance involve transforming
raw scores (the number of right answers on a test) into formats that allow
cross-student or cross-school comparisons. Test scores are most often scaled to
produce a normal distribution, or what is sometimes called the bell-shaped
curve. The reports of students' or schools' achievement levels are then
converted to relative scores: a student might be in the first quartile (the
bottom 25% of a distribution) or the 85th percentile (with 85 percent of a
comparably tested group scoring less well) or [...] (placing the
score somewhere in the middle of the set of scores). Information is focused on a
student's rank in a distribution rather than the student's level of skill [...]
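
The conversion from a raw score to a relative score can be sketched in a few
lines of Python. This is a minimal illustration under stated assumptions, not
the scaling procedure of any particular test publisher: the function names are
hypothetical, and a percentile rank is taken here simply as the percentage of
the norm group scoring strictly below the student.

```python
def percentile_rank(score, norm_group):
    """Percentage of the norm group scoring strictly below `score`."""
    if not norm_group:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_group if s < score)
    return 100.0 * below / len(norm_group)


def quartile(percentile):
    """Quartile 1 is the bottom 25% of the distribution, quartile 4 the top."""
    return min(int(percentile // 25) + 1, 4)
```

For example, a raw score of 18 measured against a hypothetical norm group
scoring 1 through 20 has 85 percent of the group below it, while the same raw
score measured against a group scoring 11 through 30 has only 35 percent below
it. The raw performance is identical; only the comparison group has changed.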



This transformation of scores into a standardized framework for
interpretation raises two related issues. First, how appropriate are the groups
used to "norm" the tests? [...]
socioeconomic characteristics of the students chosen [...] or because of the time
elapsed since the norming process took place. For instance, should test scores
from Minnesota in 1974 be compared with those in inner city Los Angeles in 1980?

Second, how does the effort to create a normal distribution of test scores
influence the relationship between testing and instruction? In order for
this "normal" feat to occur, each student should have about a 50-50 chance of
getting each item right. But clearly, those items on which instruction has
focused would have success levels much higher than 50 percent. Paradoxically,
such items would be excluded from achievement tests for being too easy. And the
paradox extends. Because tests are [...]
instruction to improve test scores (even if such could be done) is regarded as
unethical if not perverted.
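
The preference for items with a 50-50 chance of success follows from simple
arithmetic: for an item scored right or wrong, the variance of the item score
is p(1 - p), where p is the proportion answering correctly, and this quantity
is largest at p = 0.5. The Python sketch below illustrates the point. The
function names and the screening band of 0.3 to 0.7 are assumptions chosen for
illustration, not thresholds reported in this chapter.

```python
def item_variance(p):
    """Variance of a right/wrong item answered correctly by proportion p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return p * (1.0 - p)


def retained_items(p_values, low=0.3, high=0.7):
    """Indices of items whose proportion correct falls inside the band.

    The band limits are hypothetical; a norm-referenced developer would
    choose them to spread students out as much as possible.
    """
    return [i for i, p in enumerate(p_values) if low <= p <= high]
```

An item that 95 percent of students answer correctly after successful teaching
has a variance of about 0.05, against 0.25 for a 50-50 item, so it does little
to separate students and is a candidate for removal, even though it may be the
item most closely tied to what was taught.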



Thus, commercially available, standardized, norm-referenced tests create
the potential for a logically strange set of conditions, if not simultaneously,
at least in sequence:

Tests that are commercial have to be general rather than specific, and
yet they are expected to serve potentially idiosyncratic local program
requirements.

The administration of tests that are standardized requires some degree
of foreign and special procedures, procedures which withdraw such
tests from the day to day regularity of classroom instruction.

Tests that are standardized using some form of the normal distribution
provide relative information (who's better than whom) and they are
most efficient when a student's chance of success on each item is
about 50-50.

The general relationship to instructional programs of commercial,
standardized tests is therefore weakened by 1) general rather than specific
content relationships; 2) the loss of information by providing relative data;
and 3) the need to discard potentially instructionally [...] to
approximate the normal curve. Overall, then, the facts of development,
administration and scoring of norm-referenced tests tend to weaken their utility
for instructional planning, and therefore result in reduced appropriateness of
such measures as indicators of the effects of instructional programs. The
process is unfailingly interactive. Notwithstanding, achievement tests continue
to have wide influence in public education.






THE EMERGENCE OF OBJECTIVES-BASED TESTS

The linkage among achievement tests, academic psychology, and educational
statistics represents by no means the only approach that has been taken to the
assessment of achievement. A contrasting perspective argues that tests should
not be revered principally as instruments for measuring the "true capacity" of
individuals, but should instead be seen as instrumental in the teaching and
learning process itself. From this perspective, constructs binding tests to
curricular or instructional requirements have greater utility in achievement
testing than do constructs of human capacity and individual differences. This
view dates from early in the 19th century, when Rice tested spelling
performance, and it extends to the domain-referenced testing advocates of today.



Attention to the control of instructional events has led to a renewed
interest in demonstrating the short-term effectiveness of instruction. [...]
examples of this effort can be seen in the programmed instruction movement of
the early 1960s. Programmed instruction was based on the idea
that learning could proceed incrementally and, to some extent, diagnostically.
"Programs", that is, "reproducible sequences of instruction", were designed in
order to control performance at every step of the way. Learners' responses were
carefully monitored. Interrogations of "why" students may have responded one way
or another way were the subject of research studies and recommendations.

The principles upon which these programs were based share some of the ideas
promoted and debated by current test designers. For example, an early schism on
response mode [...] divisions of the field. The debate was
whether it was best to ask respondents to construct or to select answers. One
side favored gradually increasing the item difficulty of a learner-produced
response; order of instruction was fixed, but students' time-to-completion, or
rate, could vary with individual differences. The other side focused on
multiple choice responses that built in attractive error options and appropriate
remedial instruction contingent upon selection of different wrong answers.
Thus, different students might experience very different presentations and
orders determined by their error patterns.

Segments of instructional programs, called "frames", were designed to
correspond to items on achievement tests. In a frame, the learner received a
stimulus that presented the minimum number of features thought necessary to
elicit a correct response. By careful trials, these frames grew gradually more
difficult, so that at the end of the program the student was successful at
comparatively difficult tasks.

The experimental psychologists who were the designers of programmed
instruction were interested in making these sequences both effective and
efficient. They contrasted prompted frames, where the learner received "help",
with unprompted or criterion frames, where the learner performed the task
unaided. Prompts might be formal, e.g. line length, or thematic and
substantive, capitalizing on preexisting information possessed by the learner.
With support from research, many psychologists believed that prompts should be
"faded" or withdrawn, so that students became competent as quickly as possible.
The term "lean program" was coined, suggesting that anything not demonstrably
instrumental to performance should not be included in the sequence. While this
emphasis on efficiency had a "Gee-whiz, look, no hands!" appearance, the effect
was to try to express, as concretely as possible, the particular set of
performance tasks to indicate that "mastery" had been acquired.

On the issue of how one determined standards of either minimal or expert
performance, most program designers chose pleasantly redundant numbers: 90-90,
meaning ninety percent of the students were to obtain ninety percent of the
items correct, 80-80, and so on. Military training directors set high standards
(that is, 90-90). This formal proclamation of standards (the program would need
to be revised until the criterion level was met) sometimes resulted in the
selection of easier tasks and the development of simpler items, so that 90-90,
or whatever, was more practical to achieve.
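
The 90-90 rule is easy to state as a check on a set of results. The following
Python sketch is offered only as an illustration; the function name and the
data layout (one list of right/wrong answers per student) are assumptions, not
a description of how program designers of the period recorded their trials.

```python
def meets_standard(results, pct_students=90, pct_items=90):
    """True if at least pct_students percent of students answered at least
    pct_items percent of their items correctly.

    `results` holds one list of booleans per student, True for a correct item.
    """
    if not results:
        raise ValueError("results must contain at least one student")
    passing = 0
    for answers in results:
        # Integer arithmetic avoids rounding problems at the boundary.
        if 100 * sum(answers) >= pct_items * len(answers):
            passing += 1
    return 100 * passing >= pct_students * len(results)
```

With ten students, nine of whom answer nine of ten items correctly, the 90-90
standard is met. If only eight students do so, the program fails 90-90 but
would pass a more lenient 80-80 standard. The check makes plain why easier
items offer a tempting route to the criterion: the standard fixes the
proportions, not the difficulty of the tasks.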

Many well-respected, present day scholars in cognitive science expended a
good deal of their early [...] conditions in which programs
were more successful and efficient. The effects of the spate of experimental
[...] form [...] achievement
test, called, in programmed instruction parlance, a criterion test, whose
practical [...] that the frames were unprompted. When Glaser first
[...] criterion-referenced tests, his use of the term "criterion"
was interpreted in two different ways: 1) "criterion" performance was the
final [...] or set of tasks; 2) "criterion" referred not to the set of
tasks or skills to be performed, but instead to the level of performance to be
exhibited. Glaser's article stimulated a great deal of work attempting to
extend his definitions. Yet the confusion between criterion set and criterion
level remains, and it is reflected in many of the criterion-referenced tests
developed.

What must be seen as most significant about the emergence of this testing
framework as an alternative to standardized, norm-referenced achievement tests
is the context from which such ideas emanated. Criterion testing, however
flawed and imprecise at the outset, grew as a natural extension of instruction.
These were tests to assess instruction, tests of specific and generally
replicable teaching. The early definitions of programmed instruction emphasized
its "reproducible" quality, and the term first stood for things that could be
"dittoed" and later, Xeroxed, like paper and pencil, programmed booklets. The
definition later expanded to include "a set of events ... essentially
reproducible." Although the application of "criterion-test" to classroom
instructional settings proceeded rather slowly, with awkward control
mechanisms (such as "scripts" for teacher-student interaction), in time the
idea that the teacher was principally responsible for instructional consequences
was here-and-there acknowledged.

The translation process from "programmed" to teacher-led instruction
stimulated a number of pertinent developments in the field of criterion-
referenced testing. For one, the tendency to specify criterion tasks
exhaustively, in the form of highly specific behavioral objectives, caught hold
in public education for a brief time, and was even made statutory in some
places. Some delight was found by those who enjoyed concrete experiences. For
example, hundreds of reading objectives were identified for a single semester's
instruction, and Michigan State University initiated a teacher training program
with literally hundreds of objectives and tests specified.

A review of many health education programs demonstrates that behavioral
specification still has a home. But two factors have diminished the zeal for
behavioral objectives in all but the most fortified behaviorist encampments: 1)
the information load large systems of objectives place upon teachers; and 2) the
nagging concern that objectives such as "The learner will be able to print a
[...]" have a relatively small set of items appropriate for measurement.

A concern for legitimating "criterion-referenced" measures stimulated
individuals to attempt to apply some of the statistical approaches used in
standardized testing to analyze test items. Such experiments in translating
parametric statistical procedures for use with criterion-referenced tests point
up some rather distinctive differences between the two test types. First of
all, criterion-referenced tests are developed to be sensitive to instruction.
Thus, following teaching, students' performance tends to clump near the high end
of [...]; before instruction, students' performance typically is arrayed
at [...]. The comparative lack of variation in scores, first before and
then after instruction, has suggested that radically alternative statistical
procedures are necessary.

Simultaneously, the criterion-referenced test advocates, largely from the
instruction rather than psychometric side, continued to develop measures. The
problem test designers had in common was how to describe the "criterion" tasks
the tests were supposed to measure. The use of ordinary language such as "to
understand" or "to know" in describing tasks was ridiculed by many. Bloom's
article on mastery learning both highlighted programmed instruction's
expectations (i.e., instructor [...]) and accelerated this
view of testing for teacher-led, non-programmed contexts. Bloom had demonstrated
that common cognitive processes could in fact be illustrated in many ways.
Knowledge could be assessed by myriad test item formats; similarly, so could
"higher" processes on this posited taxonomy, including analysis and synthesis.

The behavioral psychology fervor of the instructional development groups
did not countenance such "vagueness". Gagne, working principally to determine
the structural relationships among learning components, suggested a framework
that was more acceptable to those wishing a more concrete approach to task
specification. He analyzed a set of five types of learning, and later proposed
that any well stated goal should include a statement about the learners'
cognitive process as well as a common format in which the performance was to be
exhibited. The complementary work of Bloom and Gagne has encouraged the
aggregation of cognitive tasks under common "levels" of learning, as one way of
dealing with the glut of tasks, objectives and test items. Corresponding work
in the specification of concept learning has also continued.

As methods of aggregation for objectives were developed using complexity of
cognitive process as a heuristic guide, so other methods were explored to
synthesize disparate objectives. In the early fifties Ralph Tyler identified
"objectives" as consisting of two parts: behavior and content. The focus on
specific behavior was first atomized by the compulsive specifiers and later
understood and consolidated by the followers of Gagne and Bloom. But the
content specification had yet to be systematically addressed within a
measurement context. Although the specification of content-behavior matrices
had been used by developers of standardized, commercial tests, in practice these
specifications principally guided the initial versions of the test, and were
less influential following empirical trial. For instance, if an item was found
to be "too easy", the item might be revised to include more attractive "wrong"
answers or to obscure the distinctiveness of the right answer. The bases for
these decisions derived from the data rather than from any prescriptive notions
about learning. Such revisions have often proved difficult to describe. In
practice, the updating of test specifications has often seemed inconsequential
once empirical data were collected. Test specifications, then, have often been
used as a rough [...] rather than as rigorous
guidelines. Indeed, item specifications have sometimes been written after the
items themselves have already been prepared.



FROM OBJECTIVES TO DOMAINS

From the psychological learning perspective, the detailed specification of
learning tasks collided with one of research's most cherished notions, i.e. the
idea of generalization. In learning terms, the notion of "transfer" describes
the "spillover" effects of instructional treatments, effects usually thought to
be [...] side effects, for instance, have often been included as
dependent measures for research studies where performance on practiced and non-
practiced tasks was contrasted. To take account of the idea of transfer in
deliberate instructional settings, one should describe the categories of
learning outcomes desired, and teach to these broader categories.



Foreshadowed by the work of Osbourne, Wells Hively set out to find a way to
explicate content categories. Hively used set theory as a point of departure.
He provided a model by which instruction could be matched to an identified set,
or "universe," of content to be sampled by test items.

These "domain-referenced achievement tests" solved a number of problems
simultaneously. First, the formulation led the way to the specification of a
limited and manageable number of tasks, in contrast to the proliferation of
specific behavioral objectives. In this model, transfer tasks were incorporated
as set members, to be intentionally addressed by both instruction and
measurement, rather than left outside to enter as good luck might provide. These
procedures also explicitly integrated content domains and behavioral
requirements.

Hively's suggestion was very simple. He proposed that an "item form" or
"shell" be created that encapsulated the behavioral requirements of tasks,
explaining the kinds of stimuli to be presented, the conditions of exposure, and
the manner of response desired. In addition, he demonstrated the specification
of classes of content to which instruction pertained, content which could be
sampled for assessment and to which the learner's skills and [...]
generalized. Hively argued that the specification of
domains should involve not only a description of content, but also a
description of the contrasts or discriminations required to demonstrate that the
learner understands the critical features of the tasks. These content limits
should identify rules or guidelines for selection of content, e.g., "all pairs
of two digit numbers," by enumeration, "all poems by Keats and Hopkins;" or
comprehensive example, "all words found on page one of the Los Angeles [...]
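
As a rough illustration of the item-form idea described above, the following
Python sketch treats a content domain as a rule-defined set ("all pairs of two
digit numbers") and an item form as a fixed shell that is filled with sampled
content. The addition task, the function names, and the seed are assumptions
made only for illustration; they are not drawn from Hively's own materials.

```python
import itertools
import random


def two_digit_pairs():
    """Content limits stated as a rule: all ordered pairs of two-digit numbers."""
    return list(itertools.product(range(10, 100), repeat=2))


def addition_item(pair):
    """Hypothetical item form (shell): a fixed stimulus and response format."""
    a, b = pair
    return {"stimulus": f"{a} + {b} = ?", "key": a + b}


def sample_items(domain, item_form, n, seed=0):
    """Draw n distinct content elements at random and fill the shell with each."""
    rng = random.Random(seed)
    return [item_form(element) for element in rng.sample(domain, n)]


domain = two_digit_pairs()
items = sample_items(domain, addition_item, n=5)
for item in items:
    print(item["stimulus"])
```

Because every item is drawn from the stated universe, a score on the sample can
be read as an estimate of performance over the whole domain, which is the sense
in which transfer tasks become set members rather than lucky extras.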




The exercise of trying to formulate rules for the identification of content
limits or boundaries has been somewhat frustrating. It has become clear, for
instance, that analysis of structural relationships within certain disciplines
has not been sufficient to permit the abstraction of sensible rules to define
content universes. It has also proved to be comparatively easy to apply this
process superficially so that content domains "look" as if they have been
created.

The attempt to produce sets of representative items has confirmed the
difficulties involved in carrying out Hively's program. The Hively approach has
gone through a number of permutations, and simplified versions for teacher
consumption have been developed, as well as more complex [...] for [...]
professional [...] a variety of issues, including how many items should be
sampled; what are the most [...] from a common domain [...]

Several models of the relationship between test domains and instruction have
emerged. The Hively model, for instance, first asked teachers to generate
exemplary instructional plans. By analyzing lesson features, Hively and his
staff [...] occurred. But the source of the [...] teachers and [...] the
Learning Mastery System developers often painstakingly analyzed text materials
and then developed specifications and tests which sampled the content area
included in these texts.

A second model has focused on the specification of the domain prior to the
development of instructional materials. In a curriculum development project in
primary reading, for example, project staff developed domain specifications and
then proceeded to develop instruction that was representative of the domain
identified. A similar model, used in the Detroit Public Schools, specified the
domains and then acquired or developed a wide range of instructional materials
that might be used to address the domains. (These approaches are represented in
Figure 1, below.)



[Figure 1. Closed systems (curricular packages) and open systems (eclectic
instruction); elements labeled Instruction and Domain.]



THE CURRENT DILEMMA



Domain specification, of course, is vulnerable to a number of charges. The
first, and most critical, concerns the source of content and behavior
specifications. Ironically, these questions are of most concern when the domain
designers are very explicit about their selection rules. It is far easier to
accept a statement such as "words used in fourth grade reading texts" as a
content description than abstractions such as "words of Latin derivation, not
more than three syllables in length." The apparent legitimacy of using text
books to define domains is, of course, partly attributable to our respect for
"real" artifacts. The fact that fourth grade texts got published might imply
that the words within them were reviewed carefully and found to be satisfactory.
It is much easier to question why Latin as opposed to Greek derivations should
be included in a domain, and why three syllables should be selected as the
cut-off point. Clearly, however, attributing legitimacy to the content in texts
involves a degree of wishful thinking, and the source of the content limits in
domain statements is in need of review.



[...] has sought to defend arbitrary specification of domains (why three
examples in a paragraph and not four?) by invoking the power of collective human
[...]. He trusts people, singly [...], to judge and decide. He is quite possibly
[...] the state of the art, or even what may be possible in the
near future. But an analysis of legitimate sources of test domains seems
warranted [...] perhaps [...] direct [...] and instructional efforts seems to be
increasing.



Ideally, [...] derive from mature knowledge in diverse fields. For example, in
selecting distractors for [...] clearly [...] assessed, as well as the range of
examples over which the performance should generalize. Thus, the knowledge
developed from cognitive psychology would be one important source of
[...] ought to assure that the topics and illustrations [...] domain are those
upon which there is reasonable consensus: is it [...] that [...] are the five
characteristics of declining world powers, and [...] who says? Domain
specifications should attend to pedagogical knowledge. Content features are
delimited by practical constraints in teaching as well as by our record in
successfully developing certain classes of skills. [...] no one can teach have
only marginal interest.

[...] can provide cues about the capacities of [...] to assimilate various sorts
of skills, as a function of maturation. We should be quick to recognize,
however, that such perceptions may vary with the theory [...]

Another much overlooked area is the language of the test items, and its
semantic and syntactic complexity. The issue of language needs to be addressed
much more specifically than simply reporting readability levels (however
useful [...] formulas may be for longer discourse than typical test stimuli).
Particularly as [...] for equity and cultural diversity in testing garners
[...] features are likely to be increasingly [...], and analysis of linguistic
issues should be incorporated systematically [...] domains.

[...] ways children confront test materials should also be examined. Analyses
of test "frames" from cognitive and linguistic perspectives [...] additional
ways to design test and instructional items that respect diversity among
students.

Tests are not going to go away soon, even those that we might personally
regard as irredeemable. We should therefore look for tests to become more
[...] and more fair. Their uses in instructional planning may ultimately call
for the true merging of instruction and testing. Tests no doubt should become
increasingly available or public, because of the constitutional guarantees
inherent in society's requirements to sort and choose people, for schools, for
certification, for retirement or dismissal, and for additional educational
[...] more cheaply and in more variety. Solutions
are needed to a host of technical problems, such as how standards are set, how
long tests should be, and how much error we can tolerate in our decisions.
Finally, we need to find better ways to assess several critically important
competencies currently given too little attention -- such as speaking, writing,
and thinking.






American Federation of Teachers

Attendance at highly [...]
perusal of the literature of many national education [...]
the average observer to conclude that teachers, [...] in
general, are highly suspicious of the modern testing industry and even hostile
to the administration of standardized tests. It is unfortunate that this public
relations-style warfare has eclipsed any examination of the real needs and
concerns of those who use tests. Even more distressing is this debate's effect
in creating an atmosphere which questions the value of comparative standards,
and which undermines whatever resources and inclination we might have to improve
upon the [...] we have.

While many groups and organizations are expending vast amounts of energy
attacking the testing industry, most teachers in classrooms have information
needs related to test results that are now, for the most part, going unmet.
Their basic [...] mean they are uncritically [...]. But they
are not demanding that all student evaluations be subjectively [...]
teachers who have developed the tests themselves. They are not calling for a
moratorium on the use of all standardized tests. And they are not hostile to
the use of minimum competency tests.

From what we in the American Federation of Teachers, AFL-CIO have been able
to find out through our own survey work and examining the work of others,
teachers feel a need to know more about their students. They want to understand
students' individual educational needs better, and they want to be able to
compare their progress with other students. They want to use this information
to improve upon what they do in the classroom. And they believe that
standardized tests are useful in providing them with some of this information.
They want more, not less, information from tests, and they would like more in
the way of inservice training to help them in interpreting and using test
results. In short, teachers are looking for more action-oriented information
about their students because they want to be more effective teachers.

In discussing how to get from where we are to where we should be, it makes
sense to begin with a discussion of how teachers themselves view the current
situation. We can then begin to analyze the gap between what is wanted and the
adequacy of what is available. Finally, it should be possible to speculate on
what needs to be done to improve things.

HOW TEACHERS THINK ABOUT STANDARDIZED TESTS

This discussion will rely on two studies of teacher views of the uses of
standardized tests. The first is an as yet unpublished survey done for the
American Federation of Teachers by the Center for Study of Evaluation at the
University of California in Los Angeles. The second is a study titled "Test
Use in Schools" by Jennie Yeh, also of the Center for the Study of
Evaluation. The results of both are supportive of one another even though
sampling and methodology were quite different.

The AFT Needs Assessment Survey was sent to a stratified, random sample of
800 AFT members. The return rate was 19%, or a total of 153 questionnaires.
The return was divided roughly evenly between elementary and secondary school
teachers. Most were from urban and suburban communities. While the return rate
is low, the fact that the conclusions of the AFT survey are reinforced by the
results of the work by Beck and Stetz would seem to indicate that the AFT work
has validity.
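
The reported return rate can be checked directly from the two figures given
above. The short Python sketch below does only that arithmetic; the function
name is an assumption chosen for illustration.

```python
def return_rate(returned, sent):
    """Percentage of mailed questionnaires that came back."""
    return 100.0 * returned / sent


# Figures stated in the text: 800 questionnaires sent, 153 returned.
rate = return_rate(153, 800)
print(f"{rate:.1f}%")  # rounds to the 19% reported in the text
```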

It must be made clear that the AFT survey measured teachers' perceptions of
their own capabilities and needs. When respondents claim to know which
[...] are best fed by information from aptitude tests and which from
achievement tests, there is no way of knowing exactly what their assumptions and
knowledge really is. It is perceptions, not absolutes, that we are looking at.
It is also safe to assume that those who answered the survey questionnaire were
probably those who felt most self-assured about their own abilities and
opinions.

The most interesting aspect of the AFT survey's conclusions for purposes of
this discussion has to do with teachers' use of test results. Student placement
or grouping and diagnosis of individual student needs were the uses ranked
highest among respondents. Secondary school teachers tended to rely less on
this information for these decisions than did primary school teachers. Those
teachers with no formal training in tests and measurement gave more weight to
test results in making these decisions than those with greater expertise.

Ironically, despite the usage of these tests for these purposes, teachers
also complain that standardized tests do not provide enough information in areas
that would seem to be directly related. For example, 64% of the respondents
said that "results do not provide prescriptive information, e.g., guidance as to
what materials, instructional activities are needed." Over half (54%) found
that "results do not provide an adequate profile of student strengths and
weaknesses." These two were among the top problems related to test usefulness
for teachers.

These results would seem to indicate that teachers recognize the
shortcomings of standardized tests when it comes to making decisions they simply
must make. Lacking other information, they use these tests anyway. This
dilemma is further indicated by the fact that 64% criticize standardized tests
for being inadequate when it comes to instructional planning. And yet, 31
percent of the respondents reported high expertise in doing precisely this with
test results, a larger percentage than for any other test-related activity.

The AFT survey also asked teachers what a perfect test would provide.
Their answers are strongly supportive of what the other survey results relate.
TEACHERS NEED MORE INFORMATION THAT WILL HELP THEM KNOW INDIVIDUAL PUPILS
BETTER.

One caution must be introduced at this point. The fact that teachers
recognize the shortcomings of standardized tests for purposes of instructional
planning, and yet at the same time use these tests for precisely that purpose,
need not cause us to conclude that standardized test use should stop. It may be
that we should develop new tests to satisfy teacher needs more precisely. But
such advocacy says nothing negative about the relevancy of standardized tests
for OTHER purposes relating to group and student comparisons. (It would be a
mistake to translate a teacher demand for more test information specifically
geared to decisions about individual students into a call for throwing out
standardized tests. The classroom uses of assessment and the broad policy uses
of assessment are different and should not be confused. The fact that teachers
need more test information for classroom use does not make the information
necessary for policy uses -- information more likely to be gleaned from
standardized tests -- irrelevant or invalid.)

The study by Jennie Yeh covered a sample of teachers in nineteen California
elementary schools. 260 teachers returned useable questionnaires, a return rate
of about 60%. One of the main findings of the study is that, while teachers
often use standardized test results for placement and grouping of students at
the beginning of the school year, they seldom use test scores to guide
instruction throughout the year. Instead, they rely on other sources of
information. (See Table 6, below.) According to Yeh,

Teachers reported that of several possible sources of information,
they most frequently used information from interactions with or
observations of students, informal assessment techniques (e.g., oral
quizzes, reading aloud) or results from teacher-developed tests to
assess their students throughout the year. The least frequently used
sources of information were the results from standardized and
instructional program or curriculum embedded tests, while moderate use
was made of information about students' place in a book and work
assignments. (1)

About 53% of the teachers who responded to the questionnaire reported that
they developed their own in-class tests. According to Yeh, teachers who
developed their own tests reported that the most important reason for doing so
was that their own tests "more accurately assess the effects of their
instruction. In other words, their own tests were seen as content valid." (1)
Teachers also reported that the format and wording of their own tests seemed
more suitable for students. (See Table 7.)

There are some rather simple conclusions that derive from the data presented
here. First of all, it is clear that teachers want and need assessment
information. It seems, however, that most of what they are getting from
standardized tests is not as useful to their decision-making needs as it could
be. They realize this, but they often use the information anyway, an outcome
that could be counterproductive. This would seem to indicate that more needs to
be done to help teachers differentiate between test types and their valid uses.
It also means, and this is the most important conclusion, that MORE TESTS NEED
TO BE DEVELOPED TO SPECIFICALLY HELP TEACHERS WITH INSTRUCTIONAL DECISIONS.

MEETING TEACHER NEEDS -- NO SIMPLE SOLUTIONS

This discussion thus far might tend to lead some to think that one logical
conclusion to our problem is to use more criterion-referenced tests and fewer
standardized, norm-referenced tests. Unfortunately the discussion about these
two types of tests has become narrow and oversimplified. Conventional wisdom in
the current debate over testing is that criterion-referenced tests should be
used rather than norm-referenced tests in order to discourage comparisons
between children, to [...] of test results by teachers and to avoid
common abuses in the release of standardized test data. We need
criterion-referenced tests, to be sure, to assist teachers in diagnosing student
needs, judging student progress and individual needs and prescribing classroom
remedies -- the very kinds of uses teachers are now, often wrongly, making of
standardized norm-referenced test information.

But we also need standards -- and setting standards often involves making
comparisons among children. How, after all, can we ever set an appropriate
mastery level for a [...]
average child can do? And how can we get a sense of "average" without a
certain amount of standardization? In other words, is it really possible to
develop a fair criterion-referenced test without administering the test to
representative samples of children and examining their performance?

While teachers may not be as immediately involved in these processes and as
immediately appreciative of their value as they are of other activities
associated with test construction, these processes are no less essential to a
comprehensive, quality testing program. In other words, an emphasis on the
kinds of demands coupled with usage that our studies turn up should not be read
to mean that standards and comparisons are irrelevant to teachers and schools.

Teacher needs go beyond even these types of tests. In the surveys
discussed here, teachers felt a need to use standardized testing data to measure
educational growth or "judge student progress" (see Table 2). Unfortunately,
one of the acknowledged problems of norm-referenced, standardized achievement
tests is that in addition to not telling us much about what the individual child
knows, they also cannot tell us much about how he is progressing. But we need
to look at children over time if we are to get an accurate picture of the
effects of schooling. Teachers need this information for their work. We also
need it so that we can know more about effective schooling. Longitudinal
studies that follow the same children for a number of years are remarkably
absent in the literature of research. The development and use of more
criterion-referenced tests should help us with this problem as well.

But this is not enough either. To really satisfy the needs of teachers --
to really get a well-rounded picture of students -- we need more varied forms of
assessment. Sheldon H. White takes up this problem in "Social Implications of
I.Q.," an essay in the compendium published by the National Elementary School
Principals, THE MYTH OF MEASURABILITY. White's essential argument is for an
expansion in the types of tests we use to measure more accurately the range of
intellectual diversity:

Our experience with schooling tells us that children show diverse
patterns of giftedness and achievement. This is true within the simplest
form of elementary school as a place to foster reading, writing and
mathematics. The similarities and differences among children concerning
these skills are [...]
point equivalent scores on a standardized achievement test....

I believe we must imagine that the reform of intelligence testing can
best be accomplished by the widespread adoption of plural tests of human
mental abilities -- such things as verbal ability, spatial ability,
reasoning, numerical ability, idea fluency, mechanical knowledge and skill,
and so forth. The inventiveness and use of such a system of
characterizing differences among children would have considerable social
benefits. It would provide a larger magic circle, encompassing
significantly more of the reality one encounters in schools. It would also
provide a considerably richer mixture of science in the midst of magic.

In other words, teachers and other educators need tests and better tests --
different tests to fill a wide variety of needs.

There is one other rather controversial point that needs to be raised in
discussing the needs of teachers for test information. It begins with looking
at what constitutes a good relationship between test use and teaching. If what
teachers want is more information about individual students, and if we can
assume they want it to assist them in their teaching, we can also assume that
the existence of tests that provide this information will influence how teachers
teach and what they teach. In other words, they may end up teaching to the
test, a thought which provokes great distress among educators generally. The
notion that teaching to the test is a bad idea is part of the contemporary
mythology surrounding tests that deserves further examination.

In a very clever essay called "There Ought to Be a Law", Norman Frederiksen
of the Educational Testing Service takes a close look at this issue.
Frederiksen tells a story of how a shift from paper and pencil multiple choice
tests to tests that required students to perform tasks related to the operation
of naval guns ultimately changed the way teaching was done in navy service
schools. He notes that the change came about not because of any effort that was
made to change the curriculum or teacher behavior. Improved student achievement
and changes in teaching style were the direct and simple result of a change in
the tests used. Frederiksen concluded:

The moral is clear: It is possible to influence teaching and learning
by changing the tests of achievement. It is also clear that those who make
the tests have a great responsibility to produce tests that influence
teachers to teach, and students to learn, the knowledge and skills that
truly reflect ... objectives ...

Frederiksen goes on to discuss his own effort to develop such tests --
tests that would seem to address the needs Sheldon White refers to, as well as
the needs of the teachers who have answered the surveys discussed here. His
tests are aimed at finding out about the psychological processes involved in
problem-solving. Their titles are such things as "Formulating Hypotheses,"
"Evaluating Proposals," "Solving Methodological Problems," and "Measuring
Constructs." Frederiksen's thinking and his work have led him to redefine
tests: "A test is any standardized procedure for eliciting the kind of behavior
we want to observe and measure. I mean the behavior we really want to measure,
not merely something related to it." Actually, this definition of tests has
been implicit in the ways teachers have used tests up until now. The problem
has been that the tests have been inadequate to the task.

But if we really had the range of tests we needed -- tests to measure a
wide variety of behaviors and the learning skills these behaviors demonstrate --
tests could help us refine the science of teaching in innumerable ways. In
fact, Frederiksen ends up with a revolutionary conclusion:



[...] that it is possible to make tests that reflect
instructional objectives more accurately than do conventional tests
[...] ways that enhance learning. If I am correct, it would seem sensible
to use tests for teaching, not just for evaluation. Forms of a test
could be constructed in such numbers and variety that they could be
used regularly for homework or classroom drill. Students could cram
and teachers could coach as much as they pleased. The cost of the
tests would be justified by their value for instructional purposes.

If those who are now attacking tests could devote just a little attention
to developing new tests and to helping teachers use both the new and the old
more appropriately, education would gain much more than it is getting from the
onslaught against standardized testing.



NOTES 

(1). Jennie P. Yeh, "Test Use in Schools," Center for the Study of Evaluation,
University of California, Los Angeles, June, 1978, page 28.

(2). Yeh, page 32.






Table 1

Use of Standardized Test Results in Instructional Planning
(4 = very important)

                                              Mean    S.D.

Student placement/grouping                     2.7    1.12
Diagnosis of individual needs                  2.8    1.11
Determining class needs                        2.5    1.05
Judging student progress                       2.5    1.04
Modification of your course content            2.3    0.96
Evaluation of your instructional program       2.4    1.04






Kemble 



8 



Table 2

Use of Standardized Test Results in Instructional Planning
by Grade Level
Importance (4 = very important)

                                                            Primary &    No Grade
                                  Primary      Secondary    Secondary    Stated
                                  (n=67)       (n=58)       (n=12)       (n=5)
                                  Mean  S.D.   Mean  S.D.   Mean  S.D.   Mean  S.D.

Student placement/groupings       2.84  0.96   2.36  1.22   3.17  1.03   3.0   1.41
Diagnosis of individual needs     2.82  1.13   2.66  1.13   3.08  1.00   3.0   1.00
Determining class needs           2.59  0.99   2.47  1.16   2.42  1.08   2.60  0.55
Judging student progress          2.67  0.96   [?]   1.11   2.17  1.03   3.40  0.55
Modification of course content    2.45  0.96   2.16  1.00   2.42  0.90   2.60  0.54
Evaluation of your program        2.51  1.05   2.16  1.06   2.67  0.89   2.80  0.84



Table 3

Influence of Formal Training in Tests and Measurement
on Use of Standardized Test Results in Instructional Planning

                                                                 College Courses
                                  No Formal      College         and Inservice
                                  Training       Courses only    Training

                                  [n=14 reported for one group; placement unclear in source]

Student placement/grouping        [...]          2.6 (1.17)      2.6 (1.11)
Diagnosis of individual needs     3.0 (1.04)     2.9 (1.13)      2.7 (1.11)
Determining class needs           2.7 (1.14)     2.6 (1.05)      2.5 (1.04)
Judging student progress          2.4 (1.16)     2.4 (1.08)      2.6 (0.99)
Modification of your
  course content                  2.7 (0.91)     2.3 (0.99)      2.3 (0.95)
Evaluation of your
  instructional program           2.7 (1.27)     2.3 (1.07)      2.4 (0.96)




Table 4

Main Problems which Inhibit the Usefulness of
Standardized Tests for Teachers

                                                              n     %

h) Results do not provide prescriptive information,          98    64
   e.g., guidance as to what materials, instructional
   activities are needed.

g) Results do not provide an adequate profile of             83    54
   student strengths and weaknesses.

i) Results are returned too late to be useful, or            83    54
   are not returned to teachers.

c) Test content does not match my curriculum.                73    48

d) Test materials are inappropriate and/or biased            71    46
   for at least some of my students.

e) Comparison groups (norms) provided by the tests           47    31
   are not meaningful.

j) Results are not reported in a form that                   46    31
   facilitates interpretation.

k) Results do not give me any new information                46    31
   about my students.

a) Tests are given at the wrong time of year. A              41    27
   better time would be ____.

b) Tests take too long to administer.                        32    21

f) Technical quality of tests is inadequate.                 28    18






Table 5 



Teachers' Perceptions of Information a Perfect
Test Would Provide



Prescriptive information 
  for each pupil

Student's strengths and 

weaknesses . 

Reasoning powers

  (analyzing, problem-solving, etc.)
Mastery of skills
English/Language (grammar,



-40 



-32 



Self and environmental 

awareness _ - 3 

Teacher involvement needed - 3
Ability to learn - 2

Scores should reach teacher - 2

Retention abilities - [illegible]



- 1 

- 1 

- 1 



Reading 

Grouping students by

  scores
Math
Writing skills
Comprehension ability
Does not exist
Provide information on

  curriculum taught
Personality - emotional

  (maturity)
Socioeconomic background
Potential ability
Verbal skills - ability

  to communicate
Strong and weak learning

  channels (i.e., visual

  vs. auditory)
Scores in relation to

  other areas, districts
Artistic ability and 



-19 
-12 

-11 

-10 

- S 
-7 

- 5 

- 5 

-'5 

- 4 

- 5 

- 2 

- 4 

= 4 



Overall factual knowledge
Motivation -
  interests
Social knowledge 



- 3 
-3 

- 3 

- 3 

- 3 



Why a student does or does 

not want to learn 
Standardized testing is Big
  Business - profits
Concentration level
Effectiveness of teacher's

  instruction

Tests must be more concrete
Tests should not confuse

  students
Should provide unbiased

  results
Do the children have

  emotional learning blocks
Leadership abilities
Learning growth (pre and

  post tests)
Ability for later employment
Chart - graph - map

  interpretation abilities



1 

1 



- 1 



- 1 





Percents of Teachers Making Uses of Standardized Achievement Test
Results in Their Classrooms

                                   Grades Combined            Groups Combined
Personally use            Total    Group 1  Group 2  Group 3  Grade    Grade    Grade    Percent
standardized achievement  sample                              K-4      5-8      9-12     of Omits*
test results for:

Individual student
  evaluation              65       63       60       80       65       68       55       7-11
Diagnosing strengths
  & weaknesses            74       74       [illeg.] 84       77       76       63       6-9
Class evaluation          45       44       40       [illeg.] 49       45       30       13-20
Instructional planning    52       51       51       58       52       56       42       10-16
Evaluation of teaching
  methods                 37       36       36       44       40       37       29       15-20
Reporting to parents      42       41       40       54       44       46       28       13-20
Reporting to students     24       22       24       33       15       34       29       17-22
Measuring "growth"        66       67       61       77       71       66       43       8-18

*Percent of teachers in the various [...] who omitted this question.






Percents of Teachers Who Consider Standardized
Achievement Test Results Useful for Various
Purposes

                                         Grades Combined            Groups Combined
Standardized test results       Total    Group 1  Group 2  Group 3  Grades   Grades   Grades
are useful to:                  Sample                              K-4      5-8      9-12

report to newspapers            10       10       10       11       [illeg.] 11       16
report to boards of education   52       53       51       54       46       56       62
report to parents               67       [illeg.] [illeg.] [illeg.] [illeg.] 70       70
report progress to students     [illeg.] [illeg.] 56       [illeg.] [illeg.] 66       71
measure educational status
  of individuals                [illeg.] [illeg.] [illeg.] 67       58       64       65
measure educational "growth"
  of individuals                77       [illeg.] [illeg.] [illeg.] 77       [illeg.] [illeg.]
screen special education
  students                      [illeg.] [illeg.] [illeg.] 67       51       59       65
help plan instruction for
  individuals                   63       62       61       70       61       68       59
help plan instruction for
  class groups                  [illeg.] [illeg.] 61       72       65       67       57
detect system-wide general
  strengths/weaknesses          [illeg.] [illeg.] [illeg.] [illeg.] 73       77       79
help evaluate teaching
  procedures or methods         34       34       32       44       36       35       30
help evaluate instructional
  materials                     [illeg.] 39       43       46       41       42       39
help evaluate teacher
  performance                   21       [illeg.] 17       30       [illeg.] 23       19
compare students with a
  national peer group           58       60       53       63       54       59       69
compare classes in a school     30       28       29       36       26       32       36
compare schools within a
  system                        36       33       37       49       33       38       41
compare a system with systems
  across the country            56       58       54       59       52       58       65

*Across questions and sub-groups, 5-12% of the teachers omitted a particular
question.



PART II



ASSESSMENT AND INSTRUCTION: WHAT MIGHT BE 



In Part I, we discussed some of the main assumptions underlying
conventional educational tests, and we examined the role these tests have played
in the instructional process. As we argued in Part I, conventional tests have
not provided much information helpful to teachers in the practice of teaching.
The reason for the limited instructional value of conventional tests, we
believe, is that the tests are based on a mistaken view of the relationship
between assessment, teaching, and learning.

To understand the role of assessment in instruction, we believe it is
necessary to begin by focusing on the ways in which teachers, in the day-to-day
practice of teaching, form judgments about what their students have learned.
Teaching, we argue, is an ongoing process of inquiry, in which teachers
continuously draw inferences about what is going on in the minds of their
students.



Conventional testing is generally conceived as something which either
precedes or follows instruction, not as something which has instructional
value in itself. But if the view we have taken is correct, assessment materials
should be conceived as ways of expanding on the inquiry process already inherent
in teaching. From this perspective, there should be little distinction between
instructional materials and assessment materials.

We develop this view of the role of assessment in instruction in some
detail in the two chapters that follow. Philip Jackson, Professor of Education
at the University of Chicago, examines the routine methods teachers rely on to
make judgments about what their children know. Jackson identifies four common
approaches teachers employ to draw conclusions about students' thought
processes, ranging from informal observation to formal questioning and testing.

Each of these four ways of coming to understand students' thought,
Jackson argues, is fallible, and taken together, the four methods cannot
eliminate entirely the fundamental uncertainties involved in making judgments
about students' cognitive skills. Furthermore, Jackson concludes, the act of
using formal questions to test student knowledge can at times be disruptive of
the teaching and learning process. Asking students continuously to demonstrate
what they know can betray a lack of trust in students' autonomous capacity to
learn.

We believe, then, that assessment materials for the purpose of
instruction should not be viewed as something to be employed once instruction is
complete. Instead, assessment materials should be viewed as materials much like
regular classroom exercises or games, but designed to reveal strengths,
weaknesses, and appropriate pathways through the curriculum for individual
students.



In the following chapter, David Hawkins explores some of the
implications of this view of assessment, teaching, and learning. He begins by
arguing that learning can be misrepresented in two seemingly opposing ways: as
a process of transmission or shaping, and as a process of autonomous
development. To understand the role of assessment in instruction, he goes on,
both views need to be combined.

Hawkins argues that children are active model builders. They learn in
the process of completing games, puzzles, and tasks. Learning is an activity in
which the learner abstracts information from the world by selectively
interacting with it, and many pathways are generally possible.

At the same time, learning depends on teacher guidance and direction.
By focusing student attention on particular elements of a task, a teacher can
increase the likelihood that the task will elicit critical skills and
capabilities. By raising questions, a teacher can uncover hidden connections
and deepen the quality of student discoveries. By assessing student interests,
strengths, and weaknesses, a teacher can select appropriate curriculum materials
and tasks.

Thus, Hawkins concludes, teaching is a dual process. The art of
teaching involves both devising a curriculum and helping students find pathways
through it. It involves both laying out tasks for students to complete and
asking students to reflect on how they completed them. Assessment materials and
instructional materials, then, are essentially similar. Tasks that encourage
learning also provide information about the learning that has occurred.






JACKSON 






THE UNCERTAINTIES OF TEACHING 



Philip W. Jackson
University of Chicago



"A teacher affects eternity," Henry Adams once wrote; "he never can tell
where his influence stops." That celebrated quotation, a mere twenty syllables
in all, must surely come close to being the perfect tribute to the teaching
profession. For what nobler thought could there be than the one expressed in
its first four words and what truer fact than that contained in the remaining
eight? "A teacher affects eternity; he never can tell where his influence
stops." Inspirational, accurate, concise: A combination hard to beat. Small
wonder, then, that Adams's verbal pat on the back, penned more than seventy years
ago, retains its appeal to this day.

Yet however fine those twelve well-chosen words may be for chiseling in
granite over the portals of schools or on the headstones of dear departed
teachers, they leave much to be desired when read as commentary on the really
troublesome uncertainties connected with the act of teaching. Adams never meant
them to be read that way, of course. He obviously was more intent on paying
respect to teachers than on being either descriptive or analytic about the
details of their work. But questions about the more mundane and worrisome
aspects of the ignorance from which teachers sometimes suffer are not long in
surfacing once we have been stimulated to think about the more flattering forms
of the unknowns they confront.

The mental process that guides our thinking about such matters seems to
work a bit like gravity, at least it does for me. Just the way most things
hurled into the air are pulled back to earth, so do my thoughts return to the
here and now after a skyward leap of the imagination. And the more commonplace
the topic, the faster, it seems, is the return. Teaching, being quite an
ordinary activity, does not allow my ruminations to soar upward for long. After
only a few seconds of wondering about the farthest reaches of a teacher's
influence I find myself asking questions like: What about the minute-by-minute
influence teachers have on the pupils before their very eyes, an arm's
length or so away? How much do they know about that?

"Much less, sometimes, than they would like to know" has got to be the only
proper answer to such questions. For what teacher has not wondered from time to
time whether this or that student [...]
whether the class as a whole was following the line of an argument or had
grasped the moral of a tale? And who among us has had all such questions
answered to his or her satisfaction? Surely the answer is: none.

So if we think of a person's influence as extending forward in time and
space, as Adams's observation compels, it is not simply that the teacher cannot
tell where his stops. In all likelihood he also cannot tell for sure where it
starts, and from time to time may have serious misgivings about how it is
progressing between start and [...] forms of uncertainty
can be quite unsettling, far more so as a rule than might any speculation about
long term influence, for they bear directly upon such matters [...]
day-to-day sense of accomplishment and the public's confidence in the work of
the schools.

As a teacher I may never live to discover that what I said one day in class
has altered the course of human history a mite, and it is a pity that such good
news is unlikely to reach me. But if I go home at the end of each day with
serious doubts about whether anything I did or said had any effect whatsoever on
anyone, I've got serious troubles, no matter what my future rewards might turn
out to be. The public too might thank me and my teaching colleagues some day as
it comes to realize what a powerful force for the good we have been. But if
tomorrow it begins to suspect that our students are not learning what they are
supposed to learn in our classes, the status of the entire teaching enterprise
is in jeopardy.

The possibility of such deeply troublesome uncertainties arising among
teachers or within the public at large does not make them a certainty, of course.
Indeed, they may never arise at all. No one, certainly, would wish them to.
But the fact that we can even imagine them occurring and can do so with ease
says something about teaching that we would do well to ponder, particularly if
we are keen on preventing such unpleasant possibilities from happening.

A part of what it says has been stated implicitly already and is simple
enough to be almost self-evident. It is that teachers may sometimes have a hard
time proving their worth, even to themselves. Why this should be so is also
easy to understand, deriving as it does from the obvious fact that teaching,
unlike masonry or brain surgery or auto mechanics or even garbage collecting,
has no visible product, no concrete physical object to make or repair or call
its own. Consequently, unlike workers in the forenamed occupations and in the
scores of others that could be added to such a list, when a teacher's work is
finished he or she is without anything tangible to hold up as the fruit of his
or her labor. No sturdy brick wall, no tumor-free brain, no smoothly purring
engine, not even a clean back alley to point to with pride as evidence of a job
well done.

Indeed, the very question of when the teacher's [...]
well or poorly, is itself problematic much of the time and [...]
by agreeing in advance upon some fairly arbitrary cutoff point, a time to call
it quits, such as a specified date on the calendar or a set number of
instructional sessions. Moreover, what is true of the termination of
instruction is equally true of resting points along the way. Even the decision
to end a single lesson is more often determined by what the clock on the wall
says than by any judgment of pedagogical accomplishment.

In this feature of their work, this absence of a tangible product whose
gradual transformation yields a clearcut criterion of progress, teachers
obviously are not alone. They are joined in this regard by ministers, priests,
rabbis, therapists, performing artists, ambassadors of good will of all
varieties (from office receptionists to public relations specialists) and
countless other workers whose chief concern is with how some special group of
people think and feel about things. At the close of the day, figuratively
speaking, all these good people, teachers numerically prominent among them, wind
up empty-handed.

Nor can it be said that teachers suffer more [...]
others who face it. There is no reason to believe that the psyches of
pedagogues are any more or less sensitive to discomfort than are those of their
fellow mortals who face a similar plight. Consequently, we might expect
self-doubt and other forms of personal misgivings to plague teachers no more
than anyone else whose labors yield little in the way of visible proof of
accomplishment.

At the same time, granted that the general condition of periodic
uncertainty occasioned by the absence of a tangible "product" is widely shared
by many occupations and, in all likelihood, is equally troublesome to each, it
is also highly likely that each occupation so burdened experiences and copes
with this state of affairs somewhat differently. [...] this to be so, if
for no other reason than that the overall circumstances of each form of work
(its mission, techniques, physical setting, and so forth) are sufficiently
unique to set it apart from others. Why not, then, the uncertainties each face?

Perhaps these too are uniquely defined for each occupation. An exploration of
that possibility sets the agenda for what follows, which is to consider in some
detail how the uncertainties of teaching are commonly and perhaps uniquely
thought about and dealt with. When such a close look is taken what emerges is a
view of teaching that is at once familiar and strange.

What puzzles teachers most? What is characteristically problematic for
them? How do they think about the uncertainties they confront? There are many
ways of framing the opening question of such an investigation, but none has a
definitive answer, for the circumstances of teaching and the personal
characteristics of individual teachers vary enormously and change over time, as
do the broad features of the profession as a whole. Consequently, what is
puzzling for one teacher may not be for another and what teachers of today look
upon as problematic may have been taken for granted or never even examined by
their predecessors a few generations back. Yet despite these situational,
personal, and historical variations, there are similarities and continuities as
well in the way teachers characteristically view their work. With respect to
the brace of questions used as openers, the answers with the broadest
applicability across different settings and different times would surely contain
some reference to two closely allied perspectives on the teacher's task. One of
these is philosophical in orientation; the other, psychological.

Philosophically speaking, all teachers might be said to be puzzled chiefly
about epistemological matters. That is, one of the most common ways of talking
about the goal of teaching is to describe it as having to do with knowledge and
its transmission. Accordingly, when it comes to the question of what worries
teachers most we might reasonably expect that the answer would have something to
do with the status of some specific bit of knowledge, be it a skill, a
propositional statement, a logical construction, or what have you. And we
hardly need conduct an empirical investigation to affirm that expectation.
Anyone who either is or has been a teacher or who has been around teachers for
any length of time (and the latter category must include almost everyone) would
surely agree that teachers seem to spend a lot of time worrying about that most
ancient of all dichotomies: THE KNOWN AND THE UNKNOWN.

But this recurrent concern, which I have christened with the adjective
epistemological, could as easily be called psychological as well. Though
teachers may be accurately described as being principally concerned with the
status of some body of knowledge, they are not concerned with it in the same way
as would be a person studying that knowledge on his or her own, nor as someone
seeking to add to that knowledge, nor yet as someone chiefly interested in the
principles or conditions by which knowledge in general comes to be established
as might, say, a cognitive psychologist or even someone who called himself a
professional epistemologist.

For one thing, teachers are chiefly interested in the status of other
people's knowledge, as compared with their own. But that does not set them
apart, of course, for there are many people who are interested in a professional
way in what others know or do not know. (Public pollsters and spies come
immediately to mind.)

What distinguishes the epistemological puzzlement of teachers, if I may
stick with such a fancy tag for the worries under discussion, is that it focuses
on knowledge that is or is not lodged, so to speak, in the minds of an
identifiable (and usually a clearly identified) group of people, called
students, and on knowledge for whose transmittal the teacher is either partially
or wholly responsible. This means, first, that of all the uncertainties facing
a teacher some of the most bothersome take the form of questions about WHAT IS
GOING ON AT THIS INSTANT INSIDE THE HEADS OR MINDS OF THE PERSON OR PERSONS
BEING TAUGHT. Do they understand? Are they following me? Has he grasped the
point? A parallel set of questions fills the pedagogical mind when instruction
has ceased. Did they understand? Have they now achieved mastery? And so
on.

It means, second, that the teacher's answers to such questions, even his
guesses as to what the answers might be, have an important bearing not only on
what his next move will be as a teacher, but also on his notion of how
successfully he has performed his work.

This is not to say that no one but a teacher raises questions about whether
another person does or does not understand whatever the questioner has been
trying to communicate. Such queries are commonplace in human affairs. They
occur each time someone says to someone else "Do you understand?" or something
equivalent. Usually, however, the "messages" whose acknowledged receipt is being
sought in such exchanges are situationally specific in content and, therefore,
do not qualify as knowledge that is generalizable to many situations the way the
contents of a teacher's lesson purport to do. When what is being communicated
does have such a generalizable quality, the exchange is decidedly "teacherish" in
tone, no matter where it occurs or whether any of the participants think of
themselves as either teachers or students.

Having said this much about the epistemological and psychological focus of
a teacher's concerns, we are ready to ask how he or she typically goes about
responding to them. What, in other words, does the teacher do to answer the
many questions that crop up during the process of teaching? Once again, we need
not initiate a tedious empirical investigation to obtain at least rough and
ready answers to this, our second order question. Given the familiarity of
teaching to most of us, all we need do is to picture in our mind's eye a typical
classroom teacher at work. By so doing, most of us can easily "see" what an
answer to our question, at least in gross terms, would have to contain. By
this easy exercise of our imagination we can, as it were, envision the major
ways in which actual teachers may be seen to go about the business of finding
out what is going on inside the heads of their students. According to my own
count, there are four such strategies. In real life not every teacher may be
found to use them all, and some teachers (such as those on television) may use
none at all, but each is common enough to be familiar to most of us. The first
three involve actions that take place while teaching is going on. The fourth
occurs only after teaching has ceased, has been temporarily halted, or has not
yet begun.

The least formal and the least intrusive of these four ways of
investigating what is happening in classrooms is the common one of looking
around the room for signs of the students having difficulty with what is being
taught. This form of visual monitoring is most readily observable when the
teacher is delivering a lecture or conducting a discussion, though it can
sometimes be seen to occur during the supervision of seat-work and study
periods. What the teacher is looking for on such occasions are those
spontaneous indicators of understanding and interest or the lack thereof that
can be "read", so to speak, from the looks on students' faces and the postures
they adopt. These include nods of assent, smiles, frowns, furrowed brows, head
scratching, fidgeting, droopy eyes, and much else that makes up the "vocabulary"
of what is sometimes spoken of these days as "body language."

A standard way of talking about this kind of visual search is to say that
the teacher is trying to find out whether or not the students are with him or
whether they are following him in their understanding. If the judgment is that
they are not, they are sometimes spoken of as being lost or out of it, a
condition calling for some kind of remedial action. Finally, though the chief
purpose of the teacher's visual scan may be to seek information about how things
are going ("things" referring principally to the students' understanding of the
material being taught), the act itself is often perceived by students to be a
kind of warning signal, reminding them to remain attentive and alert. Thus, the
procedure itself helps to bring about the conditions that are the object of the
search.

The second of the four techniques is not as easily observable as the one
just described, though it is hardly less common. Its lesser visibility derives
from the fact that it has more to do with the establishment of a classroom
procedure than with any readily identifiable movement or action on the part of
the teacher. Basically, the procedure is designed to encourage students to
volunteer information about the status of their understanding of the material
being taught. Usually this encouragement takes the form of an invitation to
interrupt the teacher or the classroom proceedings whenever there is a failure
to comprehend what is being said or done, though the formality of actually
inviting distress signals of this sort is often unnecessary. Many students
volunteer the information without being asked. (Indeed, sometimes the
interruptions come so thick and fast that the teacher is obliged to slow them up
or stop them completely, usually by requesting that such questions be held until
the end of the class or until there occurs a natural break in the session.) In
essence, then, this strategy amounts to arranging conditions so that students
will call for help when they are in trouble, thus signalling a breakdown in
comprehension or understanding.

A third common way of finding out whether or not students understand what
is being taught is to ask them directly while teaching is underway. Such
questioning takes many forms, most of which can be arranged along a continuum of
specificity that refers to both the content of the question and the person or
persons to whom it is addressed. At one extreme are those queries addressed to
no one in particular and calling for little more than a nod of the head or a
showing of hands. These are often one-word questions, such as "Understand?" or
"OK?" or "Right?" Some teachers use them so habitually that it is doubtful that
they are even aware of doing so.

At the other extreme, and much more interesting from the standpoint of
understanding what school is all about, are questions to individual students,
asking them to display their knowledge in some detail or to perform a particular
skill for the teacher's inspection. These targeted queries leave no choice but
to respond in one fashion or another, thus revealing knowledge or ignorance for
the teacher and all others present to observe. Indeed, an old-fashioned way of
dealing with the answers given was to grade and record them on the spot, a
procedure that was part of what used to be called the recitation method.

Fourth and finally, [...]
at finding out what students have learned. [...]
school must by now have guessed, these comprise tests, quizzes, exams, and
related activities that typically occur during lulls in teaching or after it has
ceased completely. In addition to the ubiquitous paper-and-pencil tests they
include term papers, oral examinations, project reports, recitals, and other
means of allowing or requiring students to display their newly acquired
knowledge and skills. Beyond occurring outside of teaching, so to speak, these
forms of questioning (for that, in one sense, is what they all are) have an
official quality and an air of finality about them that customarily are lacking
in the less formal methods that have been described. This is so because their
results commonly serve as the chief, if not the sole, basis for assigning course
grades.

Here then, if my exercise of imagining a typical teacher in action has
yielded an accurate portrayal of reality, are the four most common ways employed
by teachers to quell whatever uncertainties might arise in their minds about
what is happening or has happened in the minds of their students. There may be
other common ways as well, but none suggests itself to me. Consequently, I
offer these four as the classic procedures by which teachers cope with the
unknowns that beset them.

How successful these procedures turn out to be will depend, of course, on
the skill and consistency with which each is employed. Some teachers are
doubtlessly more skillful than are others in their use and some teaching
situations lend themselves more easily to their application than do others.
Such differences aside, however, it can be said of all four that none is
foolproof and that each has special shortcomings limiting its usefulness. Some
of these limitations are widely recognized and understood; others seem not to
be.

It is well-known, for example, that the outward signs of inner
attentiveness and understanding can be faked. Thus, by looking around the
classroom and relying on visual cues alone the teacher may think that everyone
is following the discussion or whatever, whereas many may not be. Conversely,
the student who appears to be dozing off in the far corner of the room may
actually be the most attentive of them all. Such are the ambiguities that
plague the application of the most effortless of the four methods.

We know too that calling for students to signal their own difficulties has
built-in drawbacks. Though the teacher may do everything in his or her power to
create a non-threatening atmosphere, one in which students feel free to say what
is on their minds and to confess to trouble [...], not everyone,
even in the most comfortable environment, is willing or able to take advantage
of such an opportunity. Consequently, no matter how hard the teacher might try
to have it otherwise, there will always remain the nagging worry that some
students are having difficulties in understanding but are not saying so.

Turning from these two more or less passive strategies to the two more
active ones, those involving questions the teacher puts directly to one or more
students, we find the fallibility of the information they provide to be somewhat
different in quality but no less troublesome. In fact, the use of these direct
probes and even the threat of their use introduces into the teaching encounter
an element of social tension and an unusual quality that serves to set teaching
apart from other forms of human activity. But before examining these more
subtle features of the questioning process as it occurs in classrooms, it is
well to take note of some of its more obvious limitations. Only by so doing can
we begin to understand why formal evaluative procedures, such as tests and
quizzes, are not more widely used in schools than they are.


Those questions the teacher asks of the class in general, queries like
"Understand?" or "Is that clear?", are so obviously open to false answers (or
to no answer at all) that little more need be said about them. It is worth
noting, however, that signalled comprehension or understanding can be false in
two ways. It may be that the student who nods his head when the teacher asks
"Understand?" is aware that he lacks understanding but wishes to hide that fact
from the teacher. But it may also be that he thinks he understands, but truly
does not. Thus the unreliability of the information yielded by this form of
questioning has two potential sources.

Questions that have content and that are directed at particular students may
not leave the teacher guessing whether the questioned student does or does not
understand what is [...], but they have drawbacks as well. One of these is that
an unsuccessful or incorrect reply is commonly a source of embarrassment to the
person giving it. It can also be a socially disruptive event for the class as a
whole. Consequently, a standard practice among teachers seeking to reduce the
likelihood of such "wrong" answers is to pose questions to the class as a whole
and then ask volunteers to answer them. This procedure is obviously designed
to avoid the embarrassment of calling on someone who must then confess
ignorance. But the ploy is by no means foolproof. The degree of understanding
signalled by the waving hands of volunteers can be either more or less than it
appears, as every teacher knows.

Added to the threat of embarrassment associated with direct questions from
the teacher while class is in session are economic constraints as well. Such
questioning obviously takes time, which commonly means time taken away from
direct instruction. Moreover, once a question has been asked and answered in
public its pedagogical usefulness is spent. (Teachers can and do follow up
successful answers with queries like "How many agree with Sarah?", but the
reliability of the information received in reply is generally not much greater
than when the teacher asks, "Understand?") So in addition to using up precious
class time such direct questions have to be employed judiciously for they
commonly are not reusable.

A practice that avoids many of the pitfalls and limitations being discussed
is to avoid questions that have correct or incorrect answers and concentrate
instead on eliciting student opinion or attitude. This tactic obviously eases
the social strain and makes it possible for the same question to be addressed
to more than one student. "After all," the teacher using this technique might
point out, "everyone is entitled to his or her opinion." The trouble, of
course, is that not all curricular content lends itself to such a
non-threatening sharing of individual viewpoints; indeed, critics of this
pedagogical strategy might call it an avoidance of the teacher's
responsibility for the advancement of his students' knowledge. Exchanging
opinions might be fun, the criticism might concede, but seldom does it promote
any true intellectual gains.

Turning from the kind of questioning that goes on while class is in session
to that comprising paper-and-pencil tests, term papers, and the like, we face
many of the same limitations that already have been discussed and some new ones
as well. Tests, like the directed questions teachers raise in class, are
threatening to many students. They are costly in time and energy to construct,
administer, and score. Because of such costs they almost invariably are limited
to a sampling of the questions that could be asked or even of the ones the
teacher would like to ask, and frequently a very small sampling at that.

From the standpoint of its usefulness to the teacher himself, the
information gathered through such formal procedures is seldom of much direct
value, for it typically arrives too late to be of help to the teacher in
modifying what goes on in the classroom. Assessment procedures that are part of
some of the newer schemes for individualizing instruction (e.g., IGE, IPI, etc.)
may be exceptions to this general rule, but by and large the rule stands. Tests
are relatively ineffectual means of clearing up whatever uncertainties teachers
may have about how well or how poorly they are doing their job. Methods of
evaluating students that are even further removed from a direct display of
knowledge gained through instruction (such as term papers, projects, and the
like) may provide the teacher with useful information about many aspects of a
student's performance, but, again, they are unlikely to reduce any of the
uncertainty that might exist concerning the effectiveness of the teacher's own
actions.

Here, then, are several of the more obvious drawbacks associated with the
four most common ways teachers go about the tricky business of trying to find
out if the material they are teaching is getting across to students. The
purpose of highlighting the fallibility and limitation of each method is not to
suggest that teachers should use any of them less than they do. Rather, it is
to begin to explain why some of them, particularly the more formal and direct
methods of questioning, are not used more frequently than they are. Moreover,
with respect to the latter procedures, two further considerations need be added
to those already mentioned. Both have to do with the somewhat peculiar nature
of the questions teachers ask.

Normally when people ask questions they want answers; they need them. That
is, they are seeking the information requested for its own sake. (There are, of
course, exceptions to this rule, such as rhetorical questions and those
"polite" inquiries to which a standard response is usually given, e.g., "How
are you?") Indeed, in everyday affairs if we are given cause to believe that
the person asking a question already possesses the information being sought we
would legitimately begin to wonder why he or she bothered to ask. Were they
simply teasing? Were they [...]? Were they seeking a confession? Whatever the
answer, we would be reasonably confident that something was fishy about such a
state of affairs.

Consider, however, the condition that obtains when a teacher calls upon a
student to display a piece of acquired knowledge or skill. The questioner in
this instance already possesses the information requested. What he does not
possess, of course, is the knowledge of whether the student being questioned can
accurately or faithfully produce the known answer. So the teacher's real
interest is not in the content of the answer per se, as it is in most other
everyday situations, but rather in the student's ability or lack thereof to
deliver the expected reply.

This is not to say that teachers commonly disguise their true intent nor
that they could do so successfully should they try. Except perhaps at the very
lowest levels of schooling, kindergarten or thereabouts, most students know
full well that when a teacher asks a question it is commonly to find out whether
they (the students) know or can do something and is not a search for the
about-to-be-displayed knowledge per se. Teachers rarely if ever go out of their
way to hide this fact. Nor is there any reason for them to do so. It is widely
understood and accepted by students and teachers alike that an integral part of
the teacher's task is to become reasonably certain that a particular piece of
knowledge or skill has been acquired. What better way to accomplish that goal
than the kind of direct questioning being described here?

At the same time, even though it may be perfectly legitimate for teachers 
to ask questions as they do, and quite understandable as well, there is 
something about the circumstances and the format of the inquiry that injects a 
note of artificiality into classroom proceedings. It's as though the teacher 
were somehow acting or pretending or even playing with students rather than 
responding to them forthrightly and openly. For even if it is the teacher's 
legitimate duty to try to find out whether or not a student knows something, the 
- process itself often has a kind of cat and mouse quality about it that is rarely 
present when people ask questions in out-of-school settings. The teacher, if he 
or she wanted to, could as easily give the student the answer as request it.
This must mean not simply that the teacher possesses the information being
sought, as has already been acknowledged, but also that he or she prefers, for
the time being, to keep it hidden. Is there not an element of teasing in such a
posture? Might not a perfectly natural reply to a teacher's query be: "Awww,
you know"?

And beyond the playful quality lies something even more disquieting to
contemplate. For, come to think of it, shouldn't the teacher often be in a
position to know whether or not the student knows something even without asking?
After all, it is the teacher's job to see to it that the knowledge gets
delivered, so to speak. Indeed, he or she often delivers it in person. What
can it mean then for a teacher to ask a student if he knows or understands
something that he has just recently been told? What are the sources of the
doubts that might lead to such a question?

The first thing to say about them is that they are multiple. All kinds of
mishaps may occur between the teacher's delivery of the knowledge or his
recommendation that it be obtained from somewhere else (e.g., a textbook) and
its safe deposit, so to speak, in the student's memory bank or neurological
network or however one wishes to conceptualize its resting place within the
person. The student may not have heard or seen what was said or done. He may
have received the message but not comprehended its meaning. He may have
understood something perfectly a short while back but now forgotten it. And so
on.

Moreover, all these envisioned mishaps and more that could be named have a
conceptual source that sets limits on our understanding of all that can go
wrong. They are rooted, metaphorically, in the image of the student as some kind
of container or vessel in which knowledge can be stored. Depending on whether
knowledge is itself conceived of as being solid or liquid, the task of the
teacher, within the terms of this metaphor, is to see that a sufficient quantity
of this precious commodity is packed or poured into the students under his
charge.

But there are other ways of conceptualizing the teaching-learning process
beyond depicting it as a mechanical operation involving little more than filling
the heads of students with a load of knowledge. Each of these alternative
metaphors calls attention to additional difficulties that teachers might face.
For example, if we think of knowledge as being like food that is digested,
rather than as being like an object that retains its original form or shape
inside its container, we can begin to envision the teacher as having a quite
different set of worries, many of which add to the urgency of his questioning.
Instead of wondering whether some nicely wrapped parcel of knowledge lies safe
on the shelf, so to speak, somewhere within the student, he now begins to worry
about whether it has arrived in one piece, how it matches the knowledge that was
there before, how it gets used by its new owner, and so forth.

These alternate ways of imagining what goes on when teachers try to teach
do little, if anything, to reduce the tension implicit in questions that call
for a display of knowledge or understanding. Indeed, in some ways they may be
said to increase it. That tension derives in part from the fact that the
teacher's query all too often threatens to produce a rupture in the social
relationship between teacher and student. The dynamics of this threat are
revealed in the following vignette.


Suppose a gift of china dinnerware is sent as a wedding present to the home
of a prospective bride. A few days later the gift-giver calls the home of the
bride-to-be to see if the gift arrived safely. "Yes it did," is the answer.
"I'd like to see for myself," the caller replies, "I'll drop by this evening."

What's so strange about that situation? Well, quite obviously, the odd
part is that the giver of the gift does not trust the testimony of the
bride-to-be. There is nothing peculiar about his calling to see if the gift
arrived, true enough, but ordinarily we would expect his inquiry to cease once
he has been told that the gift had reached its destination. His failure to do
so is a serious breach of social etiquette.

Though teaching only remotely resembles gift-giving, a relationship similar
to the one in the situation described threatens to come into being when
teachers insist on having students display in detail the knowledge they
possess. The resemblance is particularly close, of course, when the teacher's
direct question has been preceded by a general query concerning the
understanding of the material being taught. "Did the knowledge arrive?" asks
the teacher. "Yes," nods the student. "Let me see," says the teacher. "What's
the matter, don't you believe me?" asks the student. "Oh, sure I do," the
teacher replies, "It's just that ..."






That what? Were the teacher pressed to give a frank answer to the
student's query, one that he may have difficulty facing up to himself, I fear it
would be that something resembling distrust does lie behind the demand for hard
evidence of learning's having occurred and, much as we might wish it were
otherwise, such suspicions often turn out to have been warranted. For the truth
is that there are many reasons why people might try to hide the fact that they
do not know something, even people who are usually honest about most other
things. Ignorance is often an embarrassing condition, no two ways about it. It
is especially so in a classroom after the teacher in charge has made an effort,
either direct or indirect, to assure that something has been learned. Under
those circumstances the student who admits to not knowing what the teacher set
out to teach has confessed to having failed in one way or another: failed to
have listened, failed to have understood, failed to have done the assignment, or
what have you. He may ultimately be excused or forgiven for his inability to
respond satisfactorily, but its status as a failure remains.

Thus, it is not terribly surprising to find that many students will not
voluntarily expose their ignorance and will even try to keep it hidden when
others, such as a teacher, threaten to reveal it through direct questioning. So
the suspicious attitude that lies behind the seemingly innocent query from the
teacher is not the sign of a streak of paranoia in his personality. It is,
instead, an understandable preparedness based on a realistic appraisal of human
nature.

But the legitimacy of the teacher's suspicions does not make the act of
putting them to rest any more comfortable for either party. It is awkward, to
say the least, to have to check up on people and it is demeaning, if not
downright insulting, to have to be checked up on. However much we might try to
avert the discomfort connected with such a query (and many teachers seem to be
quite skillful at removing the sting from their questioning) it is doubtful that
the process can ever be totally painless.


To recognize this fact is not to argue for the abandonment of tests or the
elimination of direct questions in class or anything of the sort. If teachers
are to fulfill their professional responsibilities, they often have no choice
but to insist that students display their newly acquired knowledge, or the lack
thereof, no matter how painful or embarrassing such a disclosure turns out to
be. At the same time, recognizing the threat of discomfort implicit in direct
questions, tests, and the like, we can begin to understand why some teachers
might hesitate to employ such procedures; why, in other words, they might prefer
to live with the uncertainty of not knowing for sure whether their students have
in fact learned what was taught. The costs of obtaining that information must
be weighed against not only the discomfort it might bring to individual students
but also the potential damage it might do to the social relationships involved.
We may condemn the teacher who avoids at all costs the slightest threat to a
warm and comfortable relationship between himself and his students, as we might
the parent who never disciplines his child, but we can at least understand the
motives that guide him along such a course of action.

Where, then, has this discussion of pedagogical uncertainties taken us so
far? It has, I trust, underscored the central fact with which we began, which
is that the process of teaching, viewed as knowledge transmission, is fraught
with unknowns. In holding up for brief inspection what seem to be the four
major ways in which teachers cope with this condition, it has also revealed some
of the limitations of each of these strategies for finding out what is going on
"inside the heads" of students. Some of those limitations have to do with the
fallibility of the information, others with the costs, economic, psychological,
and social, connected with its use. The upshot of this analysis may not be new,
but it is important nonetheless. What it suggests is that in teaching as in
most other complex activities the path of reason is often forked. Just as it
makes good sense for a teacher to want to know whether or not his students are
learning what they should, so does it also make sense, and often equally good
sense, for him to avoid the very kind of questioning that will yield the most
reliable answers to his pedagogical inquisitiveness.

How teachers handle this tension between wanting to know what is being
learned but not wanting to spend too much time and energy in finding out and, at
the same time, not wishing to create an undue amount of social discomfort in the
process, is partially an individual matter. Some teachers seem content to press
such queries no further than what they can see with the naked eye; others insist
on questioning almost every student at almost every turn. Some use quizzes and
exams whenever the opportunity permits; others eschew formal tests completely.

But not all such variations are a matter of personal preference. It is
also doubtlessly true that some curricular areas lend themselves to direct
questioning more easily than do others. We know, for example, that mathematics
and spelling are more adapted to tests than are, say, social studies or
literature. Moreover, rudimentary levels of understanding are usually more
easily revealed by direct questioning than are higher levels. Thus, we might
expect to find a heavier use of such procedures in the earlier grades than in
later ones.


Beyond such variations in the adaptiveness of curricular content to the
strategy of direct questioning lie differences in the level of social concern
aroused by the threat of people not knowing what they are supposed to know. In
short, we worry more about whether some people are knowledgeable than we do
about others. We seem to care more, for example, about whether a physician
"knows his stuff" than we do about, say, a florist. Consequently, we would
expect teachers in a medical school to be somewhat more conscientious and
demanding about asking questions and giving tests than we would teachers of
floral design.

The overall level of such worries seems to change over time as well. Right
now we appear to be in the midst of a period of heightened public interest in
the outcomes of schooling, particularly at the secondary level and below.
Consequently, we hear a lot of talk these days about such notions as educational
accountability and minimal competency testing. How long the present trend will
continue remains to be seen, but so long as such a mood prevails teachers are
bound to feel additional pressure upon them to seek "hard" evidence of what is
or is not being learned by their students.

An additional spur to the use of direct questions in the classroom,
particularly formal tests, comes from the growth of test development and the
associated emergence of the testing industry. These developments likely have a
double effect on what teachers do to find out what their students know. On the
one hand, teachers these days are better trained in the techniques of test
construction than were their counterparts a generation or two ago. On the other
hand, today's teachers also have access to a vast supply of commercial tests
and workbooks that were not available in the past. Furthermore, the development
of mass testing programs that lie outside the realm of teacher decision-making
(such as the SAT or the National Assessment of Educational Progress)
doubtlessly heightens the overall desire of teachers to be sure that the
material they are teaching is getting across.

Given the complexity of this mix of forces impinging on the teacher's
decision to question or not to question, to test or not to test, about the only
thing that can be said for sure about such decisions is that they probably are
not as easy to make as they might first appear to be. Two groups in particular,
it seems to me, tend consistently to underestimate the difficulty of the
teacher's position in such matters. The first comprises the bulk of our
so-called experts in the field of educational testing and evaluation. The second
is made up of the majority of today's advocates of a
let's-get-tough-with-students policy.

In addition to overlooking some of the psychological and social costs of
questioning that have already been mentioned, both the testing experts and the
citizens clamoring for greater accountability usually suffer from another kind
of shortsightedness as well, which is brought about by their almost exclusive
reliance on the particular view of teaching that has been dominant in this
essay. That view, as has been said several times, depicts teaching as
essentially a process of transmitting knowledge.

Now there is nothing wrong with this outlook on the teaching process, to be
sure. Indeed, there seems to be a lot that is right with it. The important
question, however, is whether such a perspective affords a total view. In other
words, is that all there is to teaching, the transmittal of knowledge?

Some people, like Mr. Gradgrind in Dickens's Hard Times, would certainly say
yes. Indeed, even knowledge was too highfalutin a term for old Gradgrind. As a
teacher all he wanted to get across were "Facts, children, facts!" A few
flesh-and-blood teachers doubtlessly would echo the same sentiment today.

But the majority, I suspect, would be unhappy with such a narrow view.
Even those teachers who are willing to accept as the central purpose of their
work what I have called its epistemological character would probably insist that
there is more to it than that. How they talk about the larger scope of their
mission, whether they discuss it in terms of character development or moral
education or aesthetic appreciation or social responsibility or whatever,
matters less here than does the fact that none of these ways of talking is
reducible to language that is strictly epistemological. All, in other words,
refer to modes of experience and to psychological states that spread beyond the
boundaries of knowledge per se and that are not easily revealed, if at all, by
questions from even the most skillful teacher or test-maker.

There are even times, it seems, when the most sensible thing for a teacher
to do is to say nothing, or close to it. Elizabeth Hardwick, teacher and
writer, describes one such occasion. "It's hard to say anything about a fine
short story," she tells us. "I know from teaching that I would ask the class to
read Chekhov and all I could think to say to them was, 'Isn't he wonderful!'"
Most teachers have had similar moments of speechlessness in all probability. I
know I have. At such times the question of whether some piece of knowledge is
or is not lodged in somebody else's head seems like a silly thing to want to
know. So too does the broader question of precisely what influence the
teacher's actions have had. We can do little better on such occasions than to
join with Henry Adams in his celebration of all the things that teachers will
never know. These uncertainties begin afresh with each new day of teaching and
seem to have no end. Adams hit the nail on the head all right in what he had to
say about the farthest reaches of the teacher's influence, but he could as
easily have used things close at hand as his starting place. "Near and far," he
might have said, "the teacher's lot remains the same: from here to eternity,
uncertainties galore."






David Hawkins

University of Colorado

In this paper I wish to consider that aspect of educational assessment
which is primarily of use to teachers in the exercise of their art. I shall be
speaking mainly of the elementary school ages. In order to consider this aspect
I shall however lay down certain general propositions about the process of
education, of teaching and learning, and about the word "curriculum."

In a genetic sense education is a process which can be misrepresented in
two apparently opposing ways, each of which catches something of the essence but
each of which is incorrect if translated into practice and is inconsistent with
the other. Some things are complicated enough to require at least two sentences
to say them. And as in mathematics, two axioms taken together may generate a
nest of theorems which would in no way follow from either of them alone.

The first of my axioms is that education, informal education first,
formal education added, is the central process of culture transmission. By
culture, I mean everything which contributes to children's potential capacities
to become competent functional members of their society, including all
relevant aspects of knowledge, skill, character, and commitment. The metaphor
dominant in discussions of this aspect is that of the potter and the vessel, a
metaphor of shaping. Human beings are in some measure plastic, and from birth
are being instructed, molded, shaped. In culture transmission and culture
evolution education takes the place of the genetic code and subsequent
embryology. Child development, so considered, is the interaction of social
nurture with embryology.

In narrow applications of this view the attempt can be made to assimilate
the description of the process of education under the metaphor of standard
engineering design. Our public education system, dealing with numbers of
teachers in excess of a million, has evolved over a very few generations,
creating an institution which has in it, across the land, many dominant
uniformities of practice, of daily and longer-term routine, of [...] and
practice. This standardization brings with it, understandably, [...]
of quality control relating to various levels of assessment and accountability.

In a simplistic account of engineering design, two presuppositions are
basic. One is the availability of uniform raw materials of known properties,
the other is a system of rules or procedures for shaping and assembling the
materials into a finished product. In reality, however, these assumptions are
only approximated, and it is necessary, as part of the design itself, to monitor
for non-uniformity, for choice among alternative rules, for chance deviations.

In bringing this point of view, at some levels of approximation a
necessary one, to bear on the process of schooling one is forced to recognize
a very considerable non-uniformity among children, among teachers and their
practices, among schools and systems, curricula, etc. Among the many sorts of
monitoring assessments which this situation invites is the constant assessment
of children's progress along standardized curricular tracks. This may be the
basis for routing children or youths among alternative tracks, or for
evaluation of teachers and schools, etc. Such assessment itself requires some
measures of standardization, as for example in the comparison of schools,
systems, for making national comparisons, or across time. In recent decades a
dominant response to this demand has been the creation of a wide variety of
statistically standardized measures, almost invariably paper and pencil tests,
and these tend to become implicit definitions of educationally desirable
objectives. What is outstandingly obvious is that their results reflect a quite
gross variance with respect to the erstwhile uniformities which the metaphor of
engineering design has presupposed; much of this variance remains unaccounted
for except in terms of conventional ideas such as "ability," etc.

I now turn to my other axiom. Human beings are by nature active model
builders; their learning, from birth, is essentially an autonomous process
in which their behavior (conduct) is being modified by processes of
assimilation, accommodation and equilibration (Piaget) which involves the
mapping of environments and the planning of conduct, both processes taking
place at levels of motivation and informational complexity which take account
of motor-sensory input but which are not accounted for by external sensory
input (including "reinforcement").

Such input is in part an independent variable, but in part is information
elicited by the individual, in part dependent on his activity and
discrimination. Those aspects of nurture and environment which are relatively
independent of such elicitation will indeed have a directive influence on the
models built, and may support or discourage children's general model-building
properties, which are by their nature cumulative or autocatalytic
(intelligence).




In the course of such careers human beings are congenitally diverse in
their model-building motivations and propensities. Beginning from an initial
genetic diversity these differences become amplified in some essential [...]
but also can be seen as alternative pathways along which common
characteristics (of habit, language, of institutional accommodation) are or
may be developed. What appears from the viewpoint of the first axiom to be
non-uniformity is, from the point of view of the second, more adequately
described as what we call individuality.



From this point of view the readiness for learning is primarily a matter of
individual development to date, of individual motivation. The metaphor of
transmission becomes inappropriate; learning is primarily an activity of the
learner, abstracted from information selected or elicited by the learner from
primary subject matter, from the accessible world, through his selective
interaction with it. In this activity the learner is an eolithic craftsman,
building structures, models of his own, using what has been already
assimilated, including frames of thought already stored from previous learning,
with ends-in-view which are themselves framed in terms of prior experience.

The role of teacher, seen in the light of each of these axioms alone and
excluding the implications of the other, is a kind of stereotype. Under the
first axiom the central role is that of instruction, leading students along a
pre-determined pathway, on their part a step-by-step acquisition of skill and
knowledge, shaped and informed by the teacher as source or, nowadays more
typically, by the teacher as administrator of standardized sources: textbooks,
workbooks, "packages."



Under the second axiom alone the role of the teacher is no longer primarily
that of an instructor. The teacher becomes a guardian, a facilitator, a
"support facility," organizer of a material ambiance in which children's
model-building propensities will be supported, providing materials which they
can shape in accordance with these propensities, each in his own way and
according to his own readiness and momentary motivation. If there is educative
direction in this provisioning it is indirect; if there is instruction it is
instruction on demand, assistance in pursuit of an end set by the learner.

In a superb philosophical essay, still in print but seldom read with any
due regard for its content, John Dewey (1) sets forth the dialectical
development of these two axioms when they are firmly brought together. His
first step is to set forth each of these axioms, as I have called them, in such
a bald and stereotyped form that they contradict each other, not only in logic
but in a whole stream of practical consequences which each seems to entail.
These contradictions become the armamentarium of warring parties in a perennial
debate, each charging the other with espousing ideas and practices which doom
education to failure.

Dewey's second step is surely the right one; it is to say, in effect,
that both axioms are correct, and that each, taken without regard to their joint
implications, will in fact bring about the failure which it is accused of.
Without accepting both axioms, in some suitably refined form, one simply cannot
define the central problems of education.

Unless the classroom is both child-centered and subject-centered the basic
conditions of educative success will not be met. The teacher's central role is
that of bringing about a match between the child and the curriculum in an
enriched environment. Such an environment entices children's curiosity and
gives them wide access to subject matter. It leads them into the curriculum by
selecting, reorganizing and embodying its content in that environment, thus
"directing by indirection." Dewey was aware of the fact that there is a large
multiplicity of pathways into the exploration and final mastery of any domain of
elementary subject matter, and that it is only by the teacher's art that
pathways can be found to match the propensities and talents of individual
children, and sponsor the kinds of associated activities which will bring them,
as a small society, to relive the intellectual and practical learning and
invention of mankind. Dewey discusses at length the contrast between the
standardized logical organization of subject matter (e.g. the textbook) and what
he calls the psychological organization, that from which a teacher, knowing well
the logical organization, will reconstruct accessible content from it to
maximize access and commitment from diverse individuals and groups of learners.

I criticize this excellent essay, and Dewey's other related writings, for
two omissions. The first, of which I need say no more here, is that it implies
a profundity in the understanding of elementary subject matter which teachers in
fact are typically lacking, and in the development of which they need kinds of
continuing education and practical support which our school systems, dominated
in practice mostly by the first axiom, not the second, do not provide. The
second and more basic criticism is that Dewey here, as elsewhere, neglects or
fails to emphasize one central role of the teacher, one which when described
will lead us to face the central topic of this paper, assessment in the service
of teaching. It is a role which requires full accordance of both axioms. Dewey
was critical of many of his own followers, the minority camp of progressive
education, for not accepting the force of the first axiom, but he still reserved
his big artillery for their opponents.

To put the point most sharply: In the essay referred to Dewey recognizes
that a teacher's role is that of creating an ambiance in which "the child and
the curriculum" are brought together in some fruitful matching relation, an
ambiance which includes the teacher as an adult intermediary, as one who evolves
that ambiance in step with children's development and learning, unpacking and
reconstructing curricula in the process. Dewey has however nothing to say about
the instructional role of the teacher in such an ambiance, and so implicitly, in
the end, gives support to those of libertarian or anarchistic persuasion who
minimize that role in theory, and neglect it in practice. How then should one
conceive this instructional role, while having due regard for all the
implications as to the necessity of self-directed activity in model building
of the second axiom?

In that enriched ambiance which Dewey rightly conceives as a necessary
condition for adequate education, children will have choices, and if the
ambiance is well-designed and evolves well, these choices will be educationally
significant ones. Recognition of the centrality of children's freedom to choose
within such an ambiance is an easy consequence of the second axiom, and its
advocates will often use the locution of "giving choices." The practical
translation of this "giving" is often that in what nowadays would be called an
open classroom there is a diversity of activities and materials available and
"set out" for children to become engaged with, while a teacher is available,
moving about to assist, to question, to encourage.

Desirable as all such provisioning may be, as a matter of course, one must
question whether or how (though it is often desirable and too frequently
lacking) it is really of the essence. Classrooms which appear on the surface
to lack it may nevertheless be excellent, and those which provide it may fail. I
believe the essence, from Axiom II, is of a different order. Let me say:
significant choices are invented or constructed, they are very seldom simply
"given." The process of choice is part of the model-building activity, of
learning, not something prior to it which can somehow just be "given" in the
spirit of "here are the alternatives, you choose." Alternatives presented in
this way represent superficial or conventional choices. At best they are
initial moves, moves designed to elicit information by a teacher, very seldom
more than a potential doorway into subject matter or a source of steadying
involvement and comprehension. In our own experience with early math and
science we have seen many times that a rich array of enticing materials and
[...] will prove attractive to groups of children, in their own classrooms
or in visits to our advisory center. On such opening occasions a laissez-faire
attitude is, for a time, fully appropriate, and teachers should not rush in to
instruct; but it is typically not a long time, what one of us called "Christmas
morning." If this is continued too long, one will begin to see the fading of
interest, "running out of steam," the signs of boredom more often associated
with conventional classrooms of too narrow a style. Cut off, on the other hand,
by a "now let's get down to work" [...], such an opening phase has little
educative virtue; it is only a drop of nectar. The crucial phenomenon of
significant choice comes rather from communication around these early indicators
of interest and readiness, [...] curiosity, for the acquisition of [...]
term, in which what has been only a preliminary exploration is worked out,
filed, and retrieved in later use.



A teacher's role in this process is that of helping children to find
careers of learning. (2) This role has two major aspects. One is that of
assessment and planning, the other of investment, of joining, as an adult, in
the pursuits being shaped and fostered, investing them, in the eyes of children,
with adult enjoyment and dignity. I shall turn to the first, as the main
concern of this essay; but the second is important by way of background. The
quality of a teacher's own perception of subject matter determines the frame or
frames within which children's significant choices can come to definition, and
is therefore crucial to choice. In part this range of potential choice depends
on the teacher's existing repertoire of available materials and their uses. If
this is narrow and conventional, potential choices are limited as well. If it is
wide, there is a greater probability that the teacher can help evolve
choices consonant with the beginnings which children will show themselves ready
to make and extend. Since a teacher's subject-matter range and understanding
limit that teacher's capacity for its investiture as something familiar and
attractive, they also limit the teacher's ability to assess [...].



I give a small example. A second-grade teacher has been introduced to
geoboards and has given these, with an ample supply of rubber bands, to each of
27 children for a lesson. They are first invited to play with the figures they
can make. Then the word "rectangle" is discussed and illustrated, and finally
everyone is invited to "make a rectangle" on this lattice of nails. The
teacher's example has been a rectangle:

[Figure: the teacher's rectangle on the geoboard]

two high and three wide, and almost all now repeated it. We are going to
"count the squares," and that will be our introduction to area. Each child is
asked to count and most say "six." But one child has made a first rectangle
like this:

[Figure: the child's figure, a square set on the diagonal]

An observer saw him look about and (alas!) change his rubber band to the
conventional form. But then again (mirabile visu), from some inner
conviction and courage he changed it back again. When his turn came he
therefore said, "six." The teacher, dutifully following instructions, not
understanding, was disconcerted; something was wrong. Afterward the observer
was able to pick up the neglected opportunity, and the way was opened for
looking at this square on the diagonal and a more adequate approach to the
concept of area. The child in question was ready for and delighted with this
opportunity, and his work could have provided an entree to some real geometry
for other children as well. But the lesson did not include any such
opportunity. (3)







Given this sort of definition of a teacher's range (sadly inadequate in
the case cited) one can then discuss matters of assessment and planning
within that range or extending it.

The situation presented by such opportunities is one which calls for a
certain type of information matching. A channel is to be developed which gives
access to subject matter for children who have given signals as to how that
access may be achieved; among potential ways of access, some are suggested by
those signals as promising. Starting from the other side, a match is to be
achieved between some subject-matter topic or content and a diverse array of
children with their available talents and resources.

However [...] choose to [...] for specific [...], some behavioral, [...]
non-behavioristic label. A relevant operative term is understanding. As
teachers we wish to [...] understanding of the unequal arm balance. The context
I have in mind is work with some variety of materials such as weights, some
identical with each other, some diverse; a long board to be balanced (or
unbalanced) on a rounded support (for stability); sheets of hardboard to be
balanced (or unbalanced) on a [...]; materials such as Tinker toys to be
assembled into arbitrary [...], to balance (or not) on a single point.

In the course of initial play with such materials children (and adults) will
give behavioral evidence as to their understanding of what, for shorthand, we
may call the law of moments and of stability or instability in equilibrium.
Students' achievement of such understanding is our curricular objective. I
shall say, however, that this objective is not to be exhaustively defined in
terms of specific behaviors, as that term is used in the recently (and still
currently) [...]. The latter notion is based on a philosophical or
methodological opinion that the content of learning can be defined only in
terms of objective data, some specific itemized list of specific verbal or
performatory "behaviors," i.e. responses to such questions or commands as
"place block A where it will balance block B." The listing may be long, but
when set forth adequately it will give a behavioral or operational definition
of the degree to which one has mastered "balance." Such a listing can then
become, under Axiom I, a guide to the teacher, by which students can seriatim
be taught not only general verbal responses but also correct performance.
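
As an illustration only (it is not part of the original essay), the "law of
moments" referred to above can be written out in a few lines of Python. The
sketch assumes a weightless board resting on a single pivot; the function
names and the sample weights are hypothetical.

```python
def net_moment(weights_and_positions):
    """Sum of weight * signed distance from the pivot.

    Positions left of the pivot are negative, right of it positive.
    """
    return sum(weight * position for weight, position in weights_and_positions)


def is_balanced(weights_and_positions, tolerance=1e-9):
    """A (weightless) board balances when the net moment is zero."""
    return abs(net_moment(weights_and_positions)) <= tolerance


def balancing_distance(weight_a, weight_b, distance_b):
    """Distance from the pivot at which block A, placed on the opposite
    side, balances block B: weight_a * d = weight_b * distance_b."""
    return weight_b * distance_b / weight_a


if __name__ == "__main__":
    d = balancing_distance(2.0, 3.0, 4.0)
    print("place block A at distance", d)
    print("balanced:", is_balanced([(2.0, -d), (3.0, 4.0)]))
```

A routine of this kind answers the command "place block A where it will
balance block B" for one arrangement; it says nothing about whether the
principle will be retrieved in a novel situation, which is the distinction
the surrounding argument turns on.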

In opposition to this view I put forward the view that understanding is the
operative word. Understanding is per se non-behavioral; on the other hand
evidence regarding whether, or the degree to which, something like the concept
of balance is understood is behavior. The view rejected is a hangover from the
logical positivism of the 1930's, and its verification theory of meaning.
According to this epistemological theory a meaningful scientific statement is
one which can be translated into the list of observable phenomena which can be
said to verify it. The simplest refutation of this view, now almost
unanimously rejected, is that the list of such directly observable phenomena
corresponding to any hypothesis of scientific importance is always in principle
inexhaustible. The hypothesis can and must be tested by observation, but is not
defined by such evidence; if it were so formulated it would be useless, since
all of its implications would then be exhausted. (4)






The operative meaning of "understand" puts this concept in the category of
a term which cannot be exhaustively defined by any pre-determined list of
behaviors. If we could train students to a criterion level of performance with
respect to this understanding, so defined, it would not necessarily imply
understanding, and indeed, if the training were sufficiently routine, might
wholly miss the mark. Indeed, according to Axiom II, our aim is that the
student should build some model for the wide array of balance phenomena, one
which is in some measure equivalent to the distillation of simple principles
first enunciated by Archimedes; should not only build such a model but should be
able to retrieve it from memory for use in diverse situations of a familiar
kind, but also for trial in diverse situations, some of which are novel in
aspect. The extent to which such models have been built, at any point in
learning, and are retrieved in new situations, is testable in a teacher's
observation, and it is from such evidence that the teacher can in turn attempt
to build a model of the student's model by comparison with the teacher's own
model of phenomena, in this case balance.



Well-constructed models have a characteristic power (5) to reduce the
redundancy of experience. Behavior can exemplify the use of a model and give
clues to its nature, but in its own nature a model is of a different order. In
its own nature a model is first a way or habit of selecting, organizing and
providing information, and then later, by abstraction, an object in its own
right, a conceptual reality which we can describe and analyze (e.g. the law of
moments, the conditions of stability) in the language of physics or
mathematics, not the language of human behavior per se, though as a retrievable
model it must be richly indexed to phenomenal and behavioral imagery.


Understanding, so conceived, is, in principle, never complete. Models in
this sense can become linked to other models in a network, thus further reducing
residual redundancy, and increasing what might be called the cross-section
area of possible applications ("transfer of learning"). So the conceptual frame
of balance may be linked to that of mechanical work and potential energy, or to
other cases of the use of an efficacious center (Holton), to geometry
(Archimedes), and so on. In another direction it might become linked to the
barometer and the ocean of air, to still other phenomena of equilibrium and to
the image of the potential well, etc.

The representation of understanding by the idea of a growing network serves
also to suggest why there is wide latitude for educational choice in the
time-ordering of many specific topics, at least at early levels of learning.
Important ideas (frames and modes of understanding) are met along many
tracks of learning; that is why they are important, and that is why subject
matter is open to reconstruction for learning in many ways.

From this assertion of the adaptability of subject matter I turn to the
other pole, the adaptability of children. It is only when these two kinds of
adaptability are seen in conjunction, I propose, that the child and the
curriculum can be fully brought to harmonious relation.

To begin the discussion I propose to introduce two subsidiary lemmas about
the assessment of ability. One is that if we are to speak about measures of
ability or talent, in the biographies of any individual at any time, this
measure should be conceived as a vector of many dimensions, not just one
aggregate (e.g. I.Q.) or a few (e.g. the subsections of the individual I.Q.
tests). The individuality of learners [...] is a [...] the career of the human
model-builder. It is practically confirmed by the fact that in any group the
rates of [...] track are conspicuously different; there are conspicuous changes
(often inversions) in these rates as a function of the kinds of ambiance,
access, and teaching involved. The second lemma is that rates of learning are
roughly proportional to relevant antecedent learning. The first lemma implies a
profile, a vector of abilities and talents (which I visualize in polar [...]),
of which no single function (average, etc.) is either very meaningful or very
useful in teaching. The second implies that in any given specific direction on
the polar profile, the distribution of abilities in a group should be something
like the log-normal distribution, with a large variance between individuals.
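
One way to see why the second lemma points toward something like a log-normal
distribution: if each period's gain is proportional to what has already been
learned, the logarithm of attainment grows by a sum of many small random
increments, and such sums tend toward a normal distribution. The Python sketch
below is a toy simulation, not anything from the original essay; the uniformly
distributed growth rates, the 40 periods, the 2000 simulated learners, the
equal starting levels and all function names are assumptions made purely for
illustration.

```python
import math
import random
import statistics


def simulate_attainment(periods=40, low=0.0, high=0.2, start=1.0, rng=random):
    """Attainment after `periods` steps, each gain proportional to the
    level already reached (a random rate between `low` and `high`)."""
    level = start
    for _ in range(periods):
        rate = rng.uniform(low, high)
        level += rate * level  # gain proportional to antecedent learning
    return level


def skewness(values):
    """Population skewness: near 0 for a symmetric (e.g. normal) sample,
    positive when there is a long right-hand tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    levels = [simulate_attainment(rng=rng) for _ in range(2000)]
    logs = [math.log(v) for v in levels]
    print("skewness of raw levels:", round(skewness(levels), 2))
    print("skewness of log levels:", round(skewness(logs), 2))
```

Under these assumptions the raw levels show a long right-hand tail while their
logarithms are roughly symmetric, which is the signature of a log-normal
shape. Real classroom data would of course involve unequal starting points and
many dimensions at once.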

Under these lemmas it will follow, most importantly, that the assessment of
strengths (peaks of background, skill, knowledge, talent) is of prior
importance to what is also necessary, that of weakness, low points on the
profile. Thus [...] for access to geometry or arithmetic or reading from one
with special verbal or mechanical facility. Since rates of learning are
dependent on learning already achieved, the potential for bridging over from an
existing strength or talent to overcome a weakness [...] weakness alone. But
here the role of the teacher is paramount, finding ways of building crossover
linkages between areas of strength and of weakness, and thus helping children
to find choices which are both attractive and educative.

These two lemmas, I believe, indicate the principal reasons why prevailing
ideas of formal assessment are of very limited use in teaching, and often are
damaging. As to the positive side, scores on such tests are confirmation of
what teachers do or should already abundantly know. A child who has become
seriously addicted to reading will before long far exceed, in score, the age
norm for reading ability derived from such tests. The same is true of
arithmetic. To demonstrate reading levels slightly above these norms may
comfort a teacher, but it is surely no sign of excellence in children's work.
Moreover to aim instruction at the typical content of such tests is in most
cases to substitute routine skill training for the more basic art, that of
investing reading with value for children in relation to their expanding
interests in the world around them, in fantasy and story telling, in writing of
that which they deem worthy to tell of their own lives and learnings.

If the above outline of the desiderata of successful teaching is accepted,
then one has a background for the selection or invention of specific means of
assessment which such teaching requires. A first consideration is that of time
scales, the characteristic return-time from assessment to its uses in teaching.
These vary from minutes to months. Records (in memory or on paper) are vital
because the way assessments influence teaching needs to be monitored by the
teacher; individual decisions in teaching are fallible, and their success or
failure should confirm or modify teachers' profile models of individual children
and should contribute to the teachers' own professional growth. The design of
professionally useful techniques for assessment and record-keeping must come
however as a harvest from successful practice and is unlikely to be provided by
professional test-designers unfamiliar with the needs of the teaching art. I
suggest that we should examine examples of such techniques when we can find them
proposed or in use.

Given what I have said above about the multivariate and log-normal
distribution nature of such data, they are unlikely to resemble formal test
scores, though they may sometimes incorporate such measures. It should be
remembered in this connection that any discrimination is a measurement, and
that where the number of dimensions of interest exceeds the [...] of data, such
discriminations are likely to take the form of a paragraph rather than of a
number.

In assessment of a more long-term relevance in teaching, my theorems
and lemmas do not exclude formal tests, standardized or not, as sources of
confirmatory evidence useful to teachers. If my argument is correct these by
themselves, though of limited usefulness, can be used to sample children's
learning and skill in subject matter areas, provided they do not get confused
with significant ways of defining the aim of education. They can sometimes
reasonably be considered as necessary conditions of educational success, but
they by no means should be confused with what is sufficient or, to use a
currently fashionable term, basic.






NOTES 



1. John Dewey, "The Child and the Curriculum."

2. In what follows I am especially indebted to Frances Hawkins, though she
should not be held responsible for my generalizing interpretations. Cf. THE
LOGIC OF ACTION, Pantheon, 1974, and "The Eye of the Beholder," in SPECIAL
EDUCATION AND DEVELOPMENT, S. Meisels, ed., Univ. Park Press, Baltimore, 1979.

3. See also Frances Hawkins, "The Eye of the Beholder."

4. Cf. R. B. Braithwaite, Scientific Explanation, Oxford, 1963.

5. For a formal definition of "power" in this sense see D. Hawkins, "On Chance
and Choice," REVIEWS OF MODERN PHYSICS, vol. 36, no. 2, April 1964, pp. 512 [...].

6. Cf. D. Hawkins, "On Understanding the Understanding of Children," in
Informed Vision, Agathon Press, 1972, and at another level, Edwina Michener,
"Understanding Understanding Mathematics."






In the discussions of our panel several themes emerged time and time
again with great forcefulness. The issues these themes dealt with were of two
sorts.

The first kind of issue raised was that of the constraints that
present institutional structures and organization place on possible alternative
assessment practice. The second kind of issue raised was the nature of the
desirable features and properties of new alternative assessment practice. The
three papers that follow, by Parker Damon, Asa Hilliard and Howard Gruber &
Robert Keegan, address these issues directly.

J. Parker Damon is principal of the McCarthy-Towne School in Acton,
Mass. He writes from the perspective of a practicing school principal. That
perspective is augmented and complemented by his experience as a Ford Foundation
Fellow with Project TORQUE at the Education Development Center during the
1977-78 school year, and his participation in the 1979 National Institute of
Education Conference on Testing, Learning and Teaching.

Dr. Damon believes that schools and teachers have all too little of a
precious commodity called time. Thoughtful instruction and sensitive assessment
take time. In the first part of his paper he shows how the time demands of
present assessment practices cut deeply into teachers' available time, without
the compensation of yielding a return.

In the second part he outlines some assessment practices
that are both alternative to and complementary to standardized testing. In this
section he draws heavily on the ongoing experience of his own school as well as
the experiences of other educators with whom he is in close and continuing
contact.

In the last part of his paper, Dr. Damon discusses the several sorts
of support necessary to change practice. In particular, he points out that not
all problems are solved by throwing money at them. Some sources of support are
there for us to use without further expenditure of funds. These new sources of
support involve the introduction of new actors into the educational scene in the
form of parents and older students. They involve the encouragement of teachers'
professional activities and development. Above all, they call for a more
realistic and informed view of the realities of schools and teaching.


Asa G. Hilliard is Dean of the School of Education of San Francisco
State University. He writes from the dispassionate perspective of the scholar
and from the impassioned perspective of one deeply committed to social change in
the United States. This counterpoint of perspectives recurs continually
throughout his contribution to this volume.

The thread that ties Dean Hilliard's paper together is the celebration
of diversity. People differ from one another as individuals. When they form
groups, either under their own volition or under pressure from others, the
groups they form differ from one another. Jerrold Zacharias once said, "children
are different from one another, and schools should make them more so." Asa
Hilliard clearly subscribes to this view.




[...] culture and explores some of the reasons that [...]
insensitive as it is to cultural variation and diversity. He goes on to examine
the meaning of the term [...] and the interacting triad of
considerations of the type of test, the use of the test and the user of the test
result. All too often, schools and society have paid dearly for the confusion of
these considerations in the minds of the public. Finally, in closing his paper,
Dean Hilliard draws up a list of guidelines for the shaping of new assessment
practices and instruments that are very much in the spirit of the other
contributions to this volume.

In the long run one of the goals of education is to have students
internalize the assessment function and reflect on the quality of their own
learning and doing. Indeed, leading an "inspected life" may well be regarded as
the hallmark of a successful education.

By and large we don't devote much effort in our formal educational systems
to helping students develop the ability as well as the inclination to do this.
Howard Gruber and Robert Keegan, of the Institute for Cognitive Studies at
Rutgers University, describe a course in psychology they offer to
non-traditional students that emphasizes the importance of reflection on one's
own thought and learning, and offer some explicit suggestions drawn from their
experience to help those that seek to move in this direction.








McCarthy-Towne School
Acton, Mass.





[...] TO THEM?

Standardized tests have an impact on curriculum content, budget priorities,
and faculty assignments. They are used to identify individual students for
inclusion or exclusion in special programs. They influence teacher behavior.
Sometimes this influence is great, sometimes not. Whole units of study may be
added to or deleted from the curriculum; time allotments devoted to a particular
activity may be altered; sequences of learning experiences may be switched. As a
result of poor performance on a language mechanics subsection of a test, a
district or school may purchase a whole new series of language arts textbooks.
Teachers may be told to spend more time on this area of instruction in isolation
as opposed to integrating the teaching of grammar, punctuation, usage, and
capitalization with the students' other work on reading and writing. A weak
showing [...] teacher to revamp the curriculum so that students will have to use
resource books in place of other research [...]. Every one of these influences
works in the direction of further constraining the time the teacher has
available.



When recess, physical education, art, music, and special classes for certain
students are deducted from the twenty-seven and a half hour school week, not
much is left. For example, during a typical week the time not available for
whole class instruction (i.e., all students present in the classroom at the same
time) might include: 1/2 hr/day for morning meeting and predismissal cleanup =
2 1/2 hrs/wk; 1 hr/day for lunch cleanup, lunch, lunch recess = 5 hrs/wk;
1 hr/week each for art, music, physical education = 3 hrs/wk; 1 hr/day when some
students are out of the room for special classes = 5 hrs/wk; 1/2 hr/day for
recess or other kinds of recreational activity = 2 1/2 hrs/wk; 1 hr/week for
unexpected miscellaneous activities. The teacher may have eight and a half hours
per week when all the students are present. These hours, however, may not be
available in coherent blocks of time or at the most advantageous times of the
day or week. Thus when a teacher is faced with making the best use of both the
nineteen hours when not all students are present and the eight and a half when
they are, it is not surprising to find other pressures or incursions having a
marked impact on teacher attitudes and behavior.
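
The weekly time budget described above can be checked with a short tally. The
sketch below is an illustration only: the hour figures come from the text, while
the function name weekly_time_budget, the variable names, and the five-day
school week are assumptions introduced here. Art, music, and physical education
are treated as three one-hour periods, as the 3 hrs/wk total implies.

```python
from fractions import Fraction

# Figures taken from the text; the five-day week is an assumption.
SCHOOL_WEEK_HOURS = Fraction(55, 2)  # twenty-seven and a half hours
DAYS_PER_WEEK = 5

# Each entry: (activity, hours per occurrence, occurrences per week)
deductions = [
    ("morning meeting and predismissal cleanup", Fraction(1, 2), DAYS_PER_WEEK),
    ("lunch cleanup, lunch, lunch recess", Fraction(1), DAYS_PER_WEEK),
    ("art, music, physical education", Fraction(1), 3),
    ("special classes (some students out of the room)", Fraction(1), DAYS_PER_WEEK),
    ("recess or other recreational activity", Fraction(1, 2), DAYS_PER_WEEK),
    ("unexpected miscellaneous activities", Fraction(1), 1),
]


def weekly_time_budget(total_hours, items):
    """Return (hours not available, hours left) for whole class instruction."""
    unavailable = sum(hours * count for _, hours, count in items)
    return unavailable, total_hours - unavailable


unavailable, whole_class = weekly_time_budget(SCHOOL_WEEK_HOURS, deductions)
print(f"Not available for whole class instruction: {float(unavailable)} hrs/wk")
print(f"All students present: {float(whole_class)} hrs/wk")
```

Under these assumptions the tally gives nineteen hours that are not available
and eight and a half hours with all students present, which matches the figures
in the paragraph.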







experimentation, discussion, are components of the instructional process that
require a lot of teacher effort and a lot of available instructional time.
[...] in the form of standardized test outcomes will, whatever their merits,
force other priorities, activities, materials or methods to give way; in this
way tests can have a direct impact on classroom instruction that the teacher
may, or may not, agree with.

In addition, there may also be indirect kinds of impact that are not
appreciated at first. Often someone or some group other than the classroom
teacher believes the test results signal something different is required.
Administrators' [...], parents' perceptions, and citizens' concerns may
pressure the teachers to do what the teachers know to be unnecessary or wrong,
[...] fail (because the [...]. Impacts of these sorts are second-hand,
indirect.

The daily instructional process is, in the main, unaffected by the
information produced by standardized tests. To the extent that there is an
impact, it is usually a negative one. Baker points out that "studies show
that what teachers do as a result of test scores is to drop whatever it is they
are working on and do something else, or to repeat what they are doing more
frequently. Neither of these are examples of positive or constructive use of
test information; teachers are not using the information to open up their
instructional repertoire."

Teachers have a variety of good reasons for not using the data produced by
standardized tests. However, these reasons are often overlooked by those whose
views of teachers [...] myth. [...] is, therefore, [...]

--Teachers [...] their work time [...] weeks of the year.

--Teachers want to be accountable for their performance. But they
also want and deserve [...] professional improvement rather than simply
an avenue for blame.

--Teachers welcome assistance intended to [...] learning experiences.
Successful materials and practices are always being sought. Thus
anything which is easy and effective in terms of providing teachers
with accurate, insightful, diagnostic, relevant, immediate, concrete,
complete, and constructive information would be welcomed. Critics of
standardized tests argue that none of these criteria is [...]

--Teachers are willing to devote extra time (however defined) to
improve the learning experience [...] includes becoming more
proficient in the use of tests and other assessment practices. The
fact is, however, following participation in workshops and courses
designed for this purpose, the active [...]






DAMON




There are several reasons why teachers fail to use test data. First, there
are many mechanical impediments that get in the way. Turn-around time from
when the student takes the test to when the teacher receives the results is too
long; [...] be no more than a day or so. Usually, several weeks to a month pass
and frequently there are unexpected delays. Test items bear marginal
resemblance to daily classroom work. The relationship of test goals to teacher
goals and to each teacher's sequence of instruction to reach them occurs only by
chance. Moreover, the information provided to the teacher is usually too sparse
or too superficial or both for it to be of use even if it arrived promptly and
related to what the teacher was teaching.




A second reason for the non-use of test results comes from the constraints
of the materials schools and teachers must contend with. In determining what
instructional materials they and their students will use, schools and teachers
usually have only two choices. Teachers may buy them from suppliers whose wares
are practically indistinguishable, or make them themselves at night, on
weekends, or during vacations. The latter path is demanding. Adapting,
collecting, creating are time consuming efforts. It is unrealistic to expect
teachers to discard what they have created and believe to be worthwhile on the
basis of information which they do not value much.


The argument that teachers will make better use of criterion referenced
tests (than they now do of norm referenced tests) because they can participate
in selecting test items fails in face of the fact that these items are usually
selected to represent a district's goals and not to reflect what students are
doing and learning in the classroom. Teachers use materials in idiosyncratic
fashions that usually make standardized test information inappropriate for
assessing student performance in the classroom. Some instructional management
systems attempt to get around this difficulty by means of intricate
cross-reference and index schemes, and detailed sequence charts and goals
checklists.


A third reason teachers do not use standardized test results is that they
have too many students to work with WHEN TEACHING RESPONSIBILITIES ARE
CONSIDERED ALONG WITH THEIR OTHER RESPONSIBILITIES. Even if a teacher knows how
to interpret test results and how to translate them into learning experiences,
it is unlikely the teacher will have the time to do so for every child on a
subtest by subtest basis. Even less likely is an examination and comparison by
the teacher of the individual test items and responses for each child. As a
result, the teacher must rely on the summary printouts showing which items were
correctly and incorrectly responded to, the frequency of errors of individual
students compared to their classmates, scores of one kind or other compared to
what might be expected (anticipated scores) and to the scores of norm groups.
Scanning numbers on computer printout sheets is quicker than looking at each
individual student's answer sheet and comparing responses to the actual
questions which were asked. It is also more superficial and further removed
from classroom activity and direct intervention in the teaching-learning
process. Even if the teacher is sophisticated and knowledgeable about how to
use the information provided by summary printout sheets, other time pressures
such as preparing daily lesson plans in four or five curriculum areas, working
on curriculum development projects for school or district, responding to parent
and community concerns, and working with specialists in order to attend to
students with special needs may well take precedence. Those who work in
schools, like most everyone else, do not always have the luxury of adequate time






[...] used even if they do not improve the quality of what is being done. If
the teacher is in any way unsure of what the test results may mean, or how they
may be translated into classroom practice, there is little likelihood of their
being used.



ILLUSTRATION 1

METHODS FOR EVALUATION PROGRAMS AT McCARTHY-TOWNE

 1. Parent information coffees
    a) At school
    b) In neighborhoods
 2. Parent-faculty annual meeting
 3. Parent-faculty-student surveys and [...]
 4. System, state, national tests and surveys
 5. Reports by graduate students
 6. Faculty and school self-evaluation
 7. Interviews of Sixth, Seventh and Eighth Graders
 8. Surveys of Junior High School faculty

12. Survey parents of children who once attended McCa[rthy-Towne]
13. Collection of comments and concerns from public
14. Observations of student teachers
15. Reactions of results of evaluation data from all concerned
16. Videotapes of school's programs in action
17. Samples of students' work
18. Third-party evaluators
19. Comments of visitors
20. Reactions from other schools and professionals







WHY DON'T TEACHERS DEVISE AND USE ALTERNATIVES TO STANDARDIZED TESTS?

Teachers do create their own assessment instruments and procedures.
Teacher made worksheets, samples of student work, professionals' anecdotal logs,
end-of-the-chapter or unit tests in textbooks, and many other forms of
assessment (see Illustration 1) exist and may be found in poor and affluent
schools in urban, suburban, and rural districts. In many school systems, though,
such alternatives are distrusted. As a result, schools and districts operate
a two track assessment system. Assessments intended to assist administrators
and school boards in making policy, priority, and program decisions depend on
standardized tests, while assessments made to improve instruction and
individual student performance depend on a variety of techniques.

    Some people might call such [...] subjective, and thus more suspect,
    than test scores. Those who make this charge should be reminded of the
    highly subjective nature of test construction, to say nothing of the
    interpretation and use of test data. The issue may not be whether to
    use these alternatives, but whether the person requesting the
    information trusts the one providing it. Right now, the level of trust
    and confidence between the public and the professionals throughout the
    country seems low.

    (Damon, J. Parker, "Questions You Should Ask about Your Testing
    Program," THE NATIONAL ELEMENTARY PRINCIPAL, Vol. 56, No. 1,
    September/October 1976, p. 53; also reprinted in THE MYTH OF
    MEASURABILITY, Paul L. Houts, ed.)

Teachers trust the results of other assessment instruments and practices
more than they do the standardized tests. In addition to the uneven quality of
information provided by standardized tests, they also tend to continue other
confusions. For example, they encourage the use of labels or terms such as
"measurement," "assessment," "evaluation," "standard of performance," "literacy
skills," "basic skills," "hierarchies of skills, abilities, and thinking,"
"knowledge," "understanding," "attitudes," "aptitude," "anticipated
achievement," "grade level performance," and many more as if they each have
precise meanings that are distinguishable from the others or apply with
certainty to both groups and individuals. In reality, and more often than not,
these terms and labels disguise ignorance and promote myth. (See Illustration 2)




ILLUSTRATION 2

Investigative Teaching Model

Students, parents, teachers, administrators, and members of the public all
want to know how well performance, task and situation, and goal match. But each
may want this information in the perspective of one looking back into time to
examine completed performance, from the vantage point of one observing ongoing
activity, or as a predictor of future achievement. Why did someone do that?
Why is someone doing this (rather than that)? Why will someone be able to do
that? When we look into the past, we are evaluating. Observing present
activity is assessing. In looking to the future, we are estimating the
likelihood of a prediction being realized. Evaluation, assessment and
estimation are words to use carefully, not interchangeably, for they have
different meanings.



PAST                          PRESENT                       FUTURE
EVALUATION QUESTIONS          ASSESSMENT QUESTIONS          ESTIMATION QUESTIONS

What did the person do?       [...] is the activity         Will the person be
                              being performed?              able to do the [...]?

Was the goal reached?         Is it being done the          How [...] while, and
                              way it should?                after doing it?

How well was the goal         Are the appropriate           How can we help
reached?                      people, materials, and        prepare the person
                              conditions involved?          for doing the
                                                            activity? Should we?

Should the outcome have       Is intervention               Why will one form of
been different?               appropriate?                  assistance [...]
                                                            than another?

How could improvements        Where is the activity
be made?                      headed? How many
                              outcomes?

Why should the activity
stay the same, be changed?



The classroom teacher evaluates, assesses, and estimates all the time.
Standardized tests, chapter tests, and other commercially available instruments
[...] some [...] the teacher may use to answer [...] questions [...] the first
column. [...] have to rely on their [...] tailor-made or chosen materials [...]
in order to promote the learning situation [...] to be able to [...] powers of
observation, [...] and [...] teacher has to investigate what is going on [...]
questions in columns two and three.




WHAT WILL IT TAKE TO MAKE THE PUBLIC, POLICY MAKERS, AND OTHER PROFESSIONALS
TRUST THE JUDGMENT OF TEACHERS MORE, AND THE RESULTS OF STANDARDIZED TESTS
LESS?

Attacking the credibility of standardized test results or the tests
themselves will not cause a change in faith. Probably no single course of
action will inspire greater confidence in teachers' judgment; it is
likely that a series of concerted efforts would have this result. First of all,
teachers need to know what they are talking about when discussing student
performance with others. They have to be informed about the strengths and
weaknesses of different assessment instruments and procedures, about how their
classroom's curriculum content, their students' learning styles, and their own
assessment practices relate, and about how their classroom work supports the
overall objectives of the school. If teachers are able to articulate these
relationships clearly, and if they hold themselves and their colleagues to
agreed upon standards of performance related to integrating assessment,
instruction and school goals, then teachers are more likely to trust themselves
and be trusted by others.

Teachers require inservice training [...] level of understanding
and confidence. Few teacher training institutions instruct teachers-to-be on
how to evaluate the content and appropriateness of assessment and teaching
instruments. Summer workshops and released time during the school year [...]
for [...] provide teachers with the knowledge needed to [...] when
communicating their judgments, and to implement [...]. Supervisors,
principals, [...] have the responsibility for interpreting [...] inservice
training as well in order to [...] into the proper perspective by
emphasizing [...]

WHAT SUPPORT IS NECESSARY [...] USEFUL ASSESSMENT PRACTICES TO IMPROVE
INSTRUCTION?

There are different kinds of support, different times support is needed,
and different ways of combining the timing and type of support. Talking about
providing teachers with support of various kinds is easy; providing it is
something else. Support comes in the form of money, the time to do things,
encouragement and reinforcement from colleagues and supervisors, the flexibility
to change schedules and activities, the space in which to plan, operate, and
store, the services of curriculum consultants, classroom assistants, and
clerical aides, the opportunities for continuing inservice training, and access
to many different means of communication. These are the types of support a
district or school can give its faculty. Few, however, provide more than
limited amounts of any one of these supports. Fewer still provide any of them
for any length of time. There are just too many jobs to be done, too little
time in which to do them, and too few resources. In one unusual instance, a
district not far from Boston made the commitment to a long term multifaceted
effort to improve instruction via the continuous use of assessment. (See the
NESDEC booklet describing the 10 year Fitchburg project.) But most districts or
schools are unwilling or believe themselves unable to [...] to provide the
support teachers need to improve instruction on more than a hit-or-miss basis.






Beyond the schools' support of teachers is that of society. Foundation
endowments, government grants, regionalized and collaborative local efforts make
the establishment of teacher resource centers, inservice institutes
and resources exchanges, experimental and dissemination sites, and information
networks possible and practical. The boulder representing the support for
teachers to improve their instruction through the better use of assessment is
poised, ready to be rolled down the hill of practice. Teachers want the
support, administrators want to provide it, and the public is beginning to
recognize that, much as it is in the other professions, improved tools are
necessary but not sufficient for long term improvement and reform.

Many people believe that of all the kinds of support teachers require -
money, time, encouragement, autonomy, flexibility, space, people, inservice, and
communication - money is the most important. I am not so sure. I think that
perhaps encouragement is the key element to a successful support system.
[...] the form of another professional describing and defending what colleagues
are doing. Encouragement can be the recognition of the importance of a job to
be done, the commitment to it and the work of others to get it done, and the
development of a similar recognition and commitment in others. Refocusing
curricular emphasis, changing curricular content, improving instructional
practices, and restructuring learning experiences are all worthwhile efforts
most schools are concerned with pursuing. But they cannot all be done
simultaneously, and well. Which comes first, and how to support [...]
developmental activity, requires a long-term commitment to carefully
established priorities. In this sense, encouragement and commitment are
synonyms.

Money is, of course, an obvious and necessary form of support that makes
other forms easier to have. If teachers are to use assessment to improve
instruction, then they are going to need materials for use with students before,
during, and after the assessments are made. It is quite likely that many of
their existing materials will have to be modified or supplemented. Teachers and
other faculty members will also need time to learn about alternative assessment
practices and tools and about how to link assessment to the improvement of
instruction.

Time will be required during vacations, at the end of the school day, and
as a result of being released from regular responsibilities. The more frequent
use of substitutes or the provision of other forms of "classroom coverage" (via
volunteers, older students, placement in other classes, or alternative
educational experiences, e.g., parents or neighbors supervise students'
experiences away from school (creative hooky)) are a necessary form of support
so that teachers may attend inservice workshops and planning sessions.

Released time during the school day is necessary if teachers are to use
assessment to improve instruction in a serious way. The time I am referring to
should not be confused with the planning or preparation periods many teachers
have. These periods typically occur when students leave the classroom for art,
music, or physical education. Though they may be used to reorganize the next
series of activities on the basis of what has just happened in the classroom,
they are more likely used to organize materials, correct papers, or catch up on
communications with colleagues and parents. Other faculty are usually not free
at the same time so joint review and planning is not possible. Periodically,
teachers need additional time in order to contemplate the information provided
by assessment. Such time should be during the school week. To relegate the
review and use of assessment information to after school, weekends, and school
vacation periods is to mistakenly believe these instructional improvement
activities can be [...] be useful.

(1) Fred M. Hechinger, "About Education: Study Suggests Texts Are Often
Inadequate," The New York Times, April 8, 1980, C4.




CULTURAL VARIATION AND LIVING ASSESSMENT

IN THE SERVICE OF INSTRUCTION

Asa G. Hilliard III

"Man has [...] has so simplified his life and stereotyped
his responses that he might as well be in
a cage."

(Hall, 1977)

Historically, standardized testing as it has been used in education has
reflected users' strong commitment to sorting children, guessing at or
predicting children's future performance, measuring "school achievement," and
"diagnosing" learning difficulties. Further, to accomplish this there has been
an unwritten but strong demand for mass produced universal instruments which
could be easily administered, quickly scored, and inexpensive. It is this
peculiar combination of things which has impeded educators and researchers in
the search for tests or assessment procedures which can be shown to make a
positive difference for learners in the educational process. It is a pity,
since testing and assessment can be [...], above all, valid without being
standard and universal. Most important, testing and assessment, appropriately
constructed and conducted, can and should make a positive difference for
children in their education.

All people "swim" in culture. Culture is the "stuff" that people make. At
the basic or "deep structural" level, people all over the world appear to
perform similar functions. They construct language and learn language. They
organize and classify their experience according to the ways that they have
created. They expand their repertoires to accommodate and assimilate new
experiences. They do many other "cultured" or people-made things, but they don't
all look the same or do things in precisely the same way. At the surface
structural level, they manifest their common, equivalent human basic
competencies in a variety of ways.

A few years ago, the loss of culture phobia and academic recognition of
cultural variation in the testing area led to attempts to imagine how people
would behave if culture were held constant. This resulted in a "culture free"
testing movement. As it has become more and more apparent that the very
question that an examiner asks is itself a bit of culture, not nature, the goals
of testing have begun to reflect the idea of "culture fair" testing. However,
neither "culture free" nor "culture fair" testing as we now know them seems to
have much academic meaning or practical utility. The problem is
neither to do without that which all people must display (culture) nor to test
by providing an equal number of items for each culture or items which do not
favor one culture over the other in the final score (culture fair). Rather the
problem for educators is to use culture boldly as the medium of communication
and creativity.

It is my intent to illustrate [...] practice of a skillful use of culture,
the "stuff" [...] swim, without which being human would be meaningless if not
impossible, and in ignorance of which pedagogy is a joke. Specifically, I
intend to treat several key issues:

1. How can a knowledge of culture prevent [...]?

2. How can a use of culture reveal material for more effective
planning of valid instructional strategies?

3. How can a knowledge of specific cultures and an understanding of
the concept of cultural variation serve as a basis for constructing
tests which do not confuse quality achievement with cultural myopia?

4. How can the use of culturally specific tests and assessment
procedures assist teachers to help children to construct expanding
repertoires?

5. How can the use of culture enable children to enter into
dialogue with their mentors and to assume their responsibilities as
learners - as culture creators?

6. How does culturally sensitive testing and assessment allow for
a more valid approach to "accountability", or, put another way,
help educators and others to know what has happened in the
learning process and how it happened?

Sophisticated testing and assessment in the service of instruction which
uses culture is already in operation. [...] dramatic [...]
with learners who, by traditional mass produced tests, would be classified
erroneously as unable to learn much or as having learned too little to make the
next step in teaching worthwhile. It may not be mass produced, universal, or
cheap, but it can be valid and [...] instruction.

THE REALITY AND MEANING OF CULTURE

Typically, analyses of tests for cultural bias are accomplished by
comparing the differences in the pattern of responses of two or more presumably
different cultural groups to a common set of test items. Such an approach can
shed little light on a very complex matter, primarily because it takes for
granted that culture has been defined scientifically. It also allows the
attribution of a cultural identity to subjects by a given researcher. The
meaning of culture and the placement in a cultural group are critical matters
for cross-cultural research. [...] these matters are easy [...] under any
conditions. This is especially true within the United States of America, since
cultural patterns may be either relatively distinct or they may be amalgamated
or overlapping among groups. In any event, one cannot assume that culturally
specific data can be handled adequately by those who have not studied culture
systematically and professionally. Without such background, there is a strong
likelihood that the wrong questions will be posed and confusing answers
obtained.

Cultural bias in testing will produce inequity for some groups because of
error in assessment. But it is also just as important to think of culture not
as an impediment or threat to evaluation, but as a prime ally in the testing
and assessment process. Cultural experiences are data which can be used in
testing and assessment. Indeed, they must be.



But what do we mean by culture? Although there is much variation among
specialists, there are common themes which run through [...]
definitions. [...] observations of
human behavior in natural settings. (Cole and Scribner, 1974) (Hall, 1977)
(Levi-Strauss, 1966) (Labov, 1970) (Ben Sidran, 1971) For example, Edward Hall
(1977) illustrates something that he calls "extension transference." It is
here that investigators impose their order on the reality of other people.

"Another frequently dysfunctional characteristic of ET systems is that they
can be moved around and inappropriately applied. This is understandable,
because it takes years and even lifetimes to develop a good extension
system. (Sometimes we call them paradigms when they are a grammatical or
rule-making or modeling form.) In the days following the opening of Japan
to the outside world, American missionaries wrote their own grammars for
teaching Japanese to each other. Anyone who has seen one of these early
grammars knows that the missionaries projected their own Indo-European
grammatical forms onto Japanese without any reference to the actual
structure of the Japanese language. Nominative, genitive, dative, and
ablative cases all appear in the grammars with identical Japanese words
under each. A characteristic of transference phenomena is that people will
treat the transferred system as the only reality and apply it
indiscriminately to new situations. I once knew an American woman in Tokyo
who became so resentful of the Foreign Service Institute language drill
designed to reinforce the learning of proper Japanese that she simply
struck out on her own. She said, "The devil with all these honorifics.
I'm not going to learn them. I will simply learn vocabulary." What she
spoke, of course, was a most dreadful, unintelligible melange of Japanese
words and English grammar.

Something similar has happened to significant blocks of social science.
Not only has there been extension transference (not data, but methodology
is thought of as the real science), but because physical science has been
so successful, the paradigms of physical science were transferred intact
to social science, where they are seldom, if ever, appropriate."

p. 33

The inability of the investigator to understand that his or her own logic is
not unique has impeded scientific discovery for many years and in many
places. Claude Levi-Strauss (1966) gives us many excellent examples of
culturally specific logic.

"Following Griaule, Dieterlen and Zahan have established the extensiveness
and the systematic nature of native classification in the Sudan. The Dogon
divide plants into twenty-two main families, some of which are further
divided into eleven sub-groups. The twenty-two families, listed in the
appropriate order, are divided into two series, one of which is composed of
the families of odd numbers and the other of those of even ones. In the
former, which symbolizes single births, the plants called male and female
are associated with the rainy and the dry seasons respectively. In the
latter, which symbolizes twin births, there is the same relation but in
reverse. Each family is also allocated to one of three categories: tree,
bush, grass. Finally, each family corresponds to a part of the body, a
technique, a social class and an institution (Dieterlen I, 2).

Facts of this kind caused surprise when they were first brought back from
Africa."

(Levi-Strauss, p. 39)



Levi-Strauss gives another example from American Indian culture.



The Hopi, like the Zuni who particularly engaged Durkheim's and Mauss'
attention, classify living creatures and natural phenomena by means of a
vast system of correspondences. The facing table is based on the
information scattered in several authors. It is undoubtedly only a modest
fragment of an entire system, many of whose elements are missing.

p. 41



THE LOGIC OF TOTEMIC CLASSIFICATIONS

          NORTHWEST           SOUTHWEST    SOUTHEAST   NORTHEAST          ZENITH   NADIR
COLORS    yellow              blue, green  red         white              black    multicolored
ANIMALS   puma                bear         wildcat     wolf               vulture  snake
BIRDS     oriole              bluebird     parrot      magpie             swallow  warbler
TREES     Douglas fir         white pine   red willow  aspen
BUSHES    green rabbit brush  [...] brush  cliff rose  grey rabbit brush
FLOWERS   mariposa
CORN      yellow              blue         red         white
BEANS     French bean         butter bean  dwarf bean  lima bean          various



These are only a few of the examples which might be given. There would
be even more examples than there are, had ethnologists not often been
prevented from trying to find out about the complex and consistent
conscious systems of societies they were studying by the assumptions they
made about the simpleness and coarseness of 'primitives.' It did not occur
to them that there could be such systems in societies of so low an economic
and technical level since they made the unwarranted assumption that their
intellectual level must be equally low. And it is only just beginning to
be realized that the older accounts which we owe to the insight of such
rare inquirers as Cushing do not describe exceptional cases but rather
forms of science and thought which are extremely widespread in so-called
primitive societies. We must therefore alter our traditional picture of
this primitiveness. The 'savage' has certainly never borne any resemblance
either to that creature barely emerged from an animal condition and still a
prey to his needs and instincts who has so often been imagined nor to that
consciousness governed by emotions and lost in a maze of confusion.

p. 42



One more example from Levi-Strauss should clinch the point.



When he began his study of the classification of colours among the Hanunoo
of the Philippines, Conklin was at first baffled by the apparent confusions
and inconsistencies. These, however, disappeared when informants were
asked to relate and contrast specimens instead of being asked to define
isolated ones. There was a coherent system but this could not be
understood in terms of our own system which is founded on two axes: that of
brightness (value) and that of intensity (chroma). All the obscurities
disappeared when it became clear that the Hanunoo system also has two axes
but different ones. They distinguish colours into relatively light and
relatively dark and into those usual in fresh or succulent plants and those
usual in dry or desiccated plants. Thus the natives treat the shiny, brown
colour of newly cut bamboo as relatively green while we should regard it as
nearer red if we had to classify it in terms of the simple opposition of
red and green which is found in Hanunoo.

p. 55

Thus we see that categories and classificatory schemes are not nature but
culture. It is the "invisibility" of one's own culture which clouds the
perception of observers, which impedes scientific progress, and which
contributes to diagnostic error.

But let us return to a more articulated definition of culture, a definition
which makes culture amenable to empirical investigation.



Every person or group of [people is born] into a unique environment. A part
of that environment has been [shaped by] forces which operate without the
conscious effort of people. [A part] of every human environment is there as
a consequence of the creative [...]. It is this latter part, human
creativities, that can be referred to as culture. To be even more precise,
think of the range of creativities as including such things as the
following:

1. Making tools such as:
   a. language
   b. levers
   c. [...]
   d. [...]

2. Making esthetic experiences such as:
   a. music
   b. poetry
   c. art
   d. humor

3. Making history such as:
   a. stories
   b. documents or records

4. Making explanations such as:
   a. [...]

5. Making values such as:
   a. ethical principles

6. Making rituals such as:
   a. holidays
   b. celebrations
   c. ceremonies

7. Making futures such as:
   a. expectations
   b. forecasts
   c. designs

8. Making government such as:
   a. order of authority
   b. laws or rules for conduct



In short, it is the unique patterns or configurations of all of these
things which cause a group to be seen as sharing a culture. Advertisers know
this and are able to target their sales appeal to particular cultural audiences.



For example, "a cardinal rule of positioning begins with the rank of products
or brands on the ladder in the consumer's mind. It is foolhardy to
advertise head-on against the No. 1 product or brand, because your
advertising tends to reinforce the leader. This fact of life is even more
significant among blacks because they are more rank-conscious and they use
rank for more deep-seated reasons. More than whites, they tend to select
brands in the No. 1 positions and to use them as signals to their peers
and to whites. . . .

They are not good prospects for boats. Nor would they find much
identification with a scotch ad showing a boat, even if the skipper were
black. And, as George Lois suggests, to them the Cutty Sark looks like a
slave ship. Thus symbols and images are often totally different. . . .
Ads placing blacks in subservient positions received more negative
responses from blacks; white responses were more neutral."

(Gibson, 1978), pp. 60-84.

Clearly when money matters, cultural sensitivity becomes an imperative.
Businesses seem able to respond; why not educators?

Groups vary in the use of their environment. Individuals may behave in
close harmony with the particular cultural group into which they were born. On
the other hand a given individual or group may have learned patterns of other
groups in addition to or in place of their own. Color alone may not be
sufficient to identify a person as belonging to a "Black culture." Language
alone is likewise insufficient to identify a person as belonging to a particular
cultural group. Compare, for example, the culture of the majority of the
Spanish-speaking Cubans with that of the majority of the Spanish-speaking
Filipinos.







Much more could be added here. However, it should take little effort to
see that every person or group of people will make things out of materials
which are available to them at a particular place and time. Further, these
creativities begin with the accumulated experience of a particular person and
group.

Most aspects of culture for a given person or group are "invisible." (Hall,
1977) (Shuy, 1976) They are so fully learned and are so fully incorporated into
daily living patterns that they seem to the members of the culture to be
"normal." At times, it becomes hard for members of a given culture to accept
the behavior of members of other cultures as "normal" or valid. Other cultures
are visible only through one's own cultural "lenses," or "screens." (Nobles,
1976b) Therefore, another culture cannot be comprehended or grasped fully
because of the alien observer's distorted perception (Hall, 1977) (Levi-Strauss,
1966) (Cole and Scribner, 1974) (Ramirez and Castaneda, 1974) (Hilliard, 1976).

Jones (1963) gives us an excellent example with African and African-American
music and its critics.






The role of African music in the formulation of Afro-American music
was misunderstood for a great many years. And the most obvious
misunderstanding was one that perhaps only a Westerner would make, that
African music ". . . although based on the same principles of European music,
suffers from the African's lack of European technical skill in the
fashioning of his crude instruments. Thus the strangeness and out-of-tune
quality of a great many of the played notes." Musicologists of the
eighteenth and nineteenth centuries, and even some from the twentieth,
would speak of the "aberration" of the diatonic scale in African music. Or
a man like Krehbiel could say: "There is a significance which I cannot
fathom in the circumstance that the tones which seem rebellious to the
negro's sense of intervallic propriety are the fourth and seventh of the
diatonic major series and the fourth, sixth and seventh of the minor." Why
did it not occur to him that perhaps the Africans were using not a diatonic
scale, but an African scale, a scale that would seem ludicrous when
analyzed by the normal methods of Western musicology? Even Ernest Borneman
says: "It seems likely now that the common source of European and West
African music was a simple non-hemitonic pentatone system. Although
indigenous variants of the diatonic scale have been developed and preserved
in Africa, modern West Africans who are not familiar with European music
will tend to become uncertain when asked to sing in a tempered scale. This
becomes particularly obvious when the third and seventh steps of a diatonic
scale are approached. The singer almost invariably tries to skid around
these steps with slides, slurs or vibrato effects so broad as to approach
scalar value."

These sliding and slurring effects in Afro-American music, the basic
"aberrant" quality of a blues scale, are, of course, called "blueing" the
notes. But why not of "scalar value?" It is my idea that this is a
different scale.

Sidney Finkelstein, in Jazz: A People's Music: ". . . these deviations
from the pitch familiar to concert music are not, of course, the result of
an inability to sing or play in tune. They mean that the blues are a
non-diatonic music. . . ."

pp. 24-25






Jones goes on to interpret the distortion. It is an interpretation which
is very familiar to many Americans who enjoy the experience of more than
one culture.

There are still relatively "cultivated" Westerners who believe that
before Giotto no one could reproduce the human figure well, or that the
Egyptians painted their figures in profile because they could not do it any
other way. The idea of progress, as it has infected all other areas of
Western thought, is thus carried over into the arts as well. And so a
Western listener will criticize the tonal and timbral qualities of an
African or American Negro singer whose singing has a completely alien end
as the "standard of excellence." The "hoarse, shrill" quality of African
singers or of their cultural progeny, the blues singers, is thus attributed
to their lack of proper vocal training, instead of to a conscious desire
dictated by their own cultures to produce a prescribed and certainly
calculated effect. A blues singer and, say, a Wagnerian tenor cannot be
compared to one another in any way. They issue from cultures that have
almost nothing in common, and the musics they make are equally alien. The
Western concept of "beauty" cannot be reconciled to African or Afro-
American music (except perhaps now in the twentieth century, Afro-American
music has enough of a Euro-American tradition to make it seem possible to
judge it by purely Western standards. This is not quite true.) For a
Westerner to say that the Wagnerian tenor's voice is "better" than the
African singer's or the blues singer's is analogous to a non-Westerner
disparaging Beethoven's Ninth Symphony because it wasn't improvised.

pp. 29-30

To a student of music who is also a student of culture, the concept of
cultural variation will be easy to grasp. To the student who is ignorant of
culture, both his or her own culture and that of others will remain invisible
or [...].

By way of further illustration of this, another quote from Mr. Borneman:

"While the whole European tradition strives for regularity, of pitch,
of time, of timbre and of vibrato, the African tradition strives precisely
for the negation of these elements. In language, the African tradition
aims at circumlocution rather than at exact definition. The direct
statement is considered crude and unimaginative; the veiling of all
contents in ever-changing paraphrases is considered the criterion of
intelligence and personality. In music, the same tendency towards
obliquity and ellipsis is noticeable: no note is attacked straight; the
voice or instrument always approaches it from above or below, plays around
the implied pitch without ever remaining any length of time, and departs
from it without ever having committed itself to a single meaning. The
timbre is veiled and paraphrased by constantly changing vibrato, tremolo
and overtone effects. The timing and accentuation, finally, are not
stated, but implied or suggested. The denying or withholding of all
signposts."

If the rules in music are culturally specific, ranking the cultural products
would make no sense. A "norm" is a meaningful referent here only within a
cultural system. The same principles apply to linguistic differences and to
devices which depend upon language, such as paper and pencil tests. Typically,
cross-cultural observations by culturally untrained observers result in the
denial of data. They may also result in the error of interpreting the cultural
substance of one group in terms of the cultural substance of another.
(Schwaller de Lubicz, 1977) (Nobles, 1976) The matter may become even more
confused and confounded when we understand that members of two different
cultural groups may, in a particular instance, exhibit virtually identical
overt behavior. Yet the meaning of that behavior can be different for both
people.

Culture is real. It is represented in a particular group history and
present configuration, and it can be ignored only at peril to the truth.

STANDARDIZATION IN THE FACE OF CULTURAL REALITY

The overwhelming majority of test and measurement specialists are not
trained in the study of culture, and are insensitive to gross sources of
variation in their experimental settings. This is an academic failure, which
is reflected in three ways:

1. Among standardized test makers, there is a general ignorance of the
literature about the investigator's own culture as a culture. (Hall, 1977)
([...], 1976)



2. Among standardized test makers, there is a general ignorance of the
literature which describes the culture of specific cultural groups other
than the investigator's own.



3. Among standardized test makers, there is a general ignorance of the
literature which provides a metalanguage for communicating about cultures
that are tested. (Hall, 1977) (Levi-Strauss, 1966) (Labov, 1970) (Chomsky,
1957)







"Considerable attention has been given to language. In this area,
the deficit theory appears as the concept of "verbal deprivation": Negro
children from the ghetto area receive little verbal stimulation, are said
to hear very little well-formed language, and as a result are impoverished
in their means of verbal expression: they cannot speak complete sentences,
do not know the names of common objects, cannot form concepts or convey
logical thoughts.

"Unfortunately, these notions are based upon the work of educational
psychologists who know very little about language and even less about Negro
children. The concept of verbal deprivation has no basis in social
reality: in fact, Negro children in the urban ghettos receive a great deal
of verbal stimulation, hear more well-formed sentences than middle-class
children and participate fully in a highly verbal culture; they have the
same basic vocabulary, possess the same capacity for conceptual learning,
and use the same logic as anyone else who learns to speak and understand
English.

"The notion of "verbal deprivation" is a part of the modern
mythology of educational psychology, typical of the unfounded notions which
tend to expand rapidly in our educational system. In past decades
linguists have been as guilty as others in promoting such intellectual
fashions at the expense of both teachers and children. But the myth of
verbal deprivation is particularly dangerous, because it diverts the
attention from real defects of our educational system to imaginary defects
of the child; and as we shall see, it leads its sponsors inevitably to the
hypothesis of the genetic inferiority of Negro children which it was
originally designed to avoid.



"Linguists are also in an excellent position to assess Jensen's claim
that the middle-class white population is superior to the working-class and
Negro populations in the distribution of "Level II" or "conceptual"
intelligence. The notion that large numbers of children have no capacity
for conceptual thinking would inevitably mean that they cannot use
language, for even the simplest linguistic rules we discussed above involve
conceptual operations more complex than those used in the experiment cited
by Jensen. Let us consider what is involved in the use of the general
English rule that incorporates the negative with the first indefinite. To
learn and use this rule, one must first identify the class of indefinites
involved, any, one, ever, which are formally quite diverse. How is this
done? These indefinites share a number of common properties which can be
expressed as the concepts "indefinite," "hypothetical," and "nonpartitive."
One might argue that these indefinites are learned as a simple list by
"association" learning. But this is only one of the many syntactic rules
involving indefinites, rules known to every speaker of English, which
could not be learned except by an understanding of their common abstract
properties."

(Labov, 1970, pp. 153-186)

In this case the metalanguage would help the observer to focus on the logic
of the discourse rather than upon the standardization of content. It would
enable false deficits to be correctly identified as such. It would also enable
false superiority to be identified as such.

In every learned journal one can find examples of jargon and empty
elaboration, and complaints about it. Is the "elaborated code" of Bernstein
really so "flexible, detailed and subtle" as some psychologists
believe? (Jensen 1968: 119) Isn't it also turgid, redundant,
and empty? Is it not simply an elaborated style, rather than a superior
code or system?



Our work in the speech community makes it painfully obvious that in
many ways working-class speakers are more effective narrators, reasoners,
and debaters than many middle-class speakers who temporize, qualify, and
lose their argument in a mass of irrelevant detail. Many academic writers
try to rid themselves of that part of middle-class style that is empty
pretension, and keep that part that is needed for precision. But the
average middle-class speaker that we encounter makes no such effort; he is
enmeshed in verbiage, the victim of sociolinguistic factors beyond his
control.

(Labov, 1970, p. 164)

Those who are ignorant of the principles of culture tend to commit certain
errors:

- . - *. »-..*.. * 



1. They dismiss talk of culture and [...] as matters of rhetoric,
ideology, politics, or sentimentality. It must be noted that they do this
summarily and without data.

2. They then proceed in attempting to force cultural realities into
standardized [...] a priori categories or classifications.

One begins to suspect more politics than science here, especially among the
professional disciplines. For it is clear that the power structure of the
present standardized testing community would shift dramatically if the monopoly
of the psychologists over school assessment were to be toppled. Cultural
anthropology and sociolinguistics, among other academic disciplines,
have the tools to remedy this deplorable condition of cultural ignorance.
(Shuy, Hall, Labov, Chomsky) But these and other relevant disciplines are
virtually barred from the area of school assessment. Their knowledge base is
virtually [absent]. (Hilliard, 1979a) It appears that, among standardized test
makers, questions of culture are not debated at all. Doing so is likely to
result in one of three things:




1. A confession of ignorance of relevant empirical data.

2. A revelation of knowledge of relevant empirical data, but with a
deliberate intent and calculation to conceal that knowledge in order to
deceive audiences.

3. An adjustment of present assessment practice to accommodate empirical
data.



Consider the following examples, which illustrate the fundamental threat to
[validity] posed when standardized testing relies on language. [They]
illuminate clearly the folly of aggregating test item [scores when] the
linguistic meaning is variable among examinees. Roger Shuy [shows] how
learning to read is related to linguistic features which [vary among] diverse
linguistic communities.

Practical experience indicates that different levels of language may take
precedence at different stages in the progression of reading skills. Thus,
sound-symbol correspondences may be relatively important for the reader in
the beginning stages of reading, but they become less important as syntax
and semantics become more important.

[...] and discourse are all culturally specific. [In the] context of a
standardized test of reading ability, [items mean different] things [to
examinees] from various linguistic communities. [The] aggregation and
comparison of performances where meanings [differ is] scientific
irresponsibility.

Margaret Donaldson illustrates this fallacy as a mismatch between the
communication processes [of the examiner and those] of the examinee.

We are urged by Piaget to believe that the child's behavior in this
situation gives us a deep insight into the nature of his world. This world
is held to be one which is composed largely of "false absolutes." That is
to say, the child does not appreciate that what he sees is relative to his
own position; he takes it to represent absolute truth or reality - the
world as it really is. Notice that this implies a world marked by extreme
discontinuity. Any change in position means abrupt change in the world and
a sharp break with the past. And indeed this is how Piaget does see it
for the young child: that he lives in the state of the moment, not
bothering himself with how things were just previously, with the relation
of one state to those which come before or after it. His world is like a
film run slowly, as Piaget says elsewhere.



This is by no means to say that Piaget thinks the child has no memory
of the earlier "stills." The issue for Piaget is how the momentary states
are linked, or fail to be linked, in the child's mind. The issue is how
well the child can deal conceptually with the transitions between them.

All this has far-reaching implications for the child's ability to
think and reason, and we shall come back to these implications later. But
first let us consider how children perform on a task which is in some ways
very like the "mountains" task and in other extremely important ways very
different.


This task was devised by Martin Hughes. In its simplest form, it
makes use of two "walls" intersecting to form a cross, and two small dolls,
representing respectively a policeman and a little boy. . . . In the studies
which Hughes conducted the policeman was placed initially as in the diagram
so that he could see the areas marked B and D, while the areas A and C were
hidden from him by the wall.




The child was then introduced to the task very carefully, in ways that
were designed to give him every chance of understanding the situation
and grasping what was being asked of him. First, Hughes put the boy doll
in section A and asked if the policeman could see the boy there. The
question was repeated for sections B, C and D in turn. Next the policeman
was placed on the opposite side, facing the wall that divides A from C, and
the child was asked to "hide the doll so that the policeman can't see him."
If the child made any mistakes at these preliminary stages, his error was
pointed out to him, and the question was repeated until the correct answer
was given. But very few mistakes were made.




Then the test proper began. And now the task was made more complex.
Another policeman was produced and the two were positioned.



The child was told to hide the boy from both policemen, a result which
could only be achieved by the consideration and coordination of two
different points of view. This was repeated three times, so that each time
a different section was left as the only hiding place.

The results were dramatic. When thirty children between the ages of
three-and-a-half and five years were given this task, 90 percent of their
responses were correct. And even the ten youngest children, whose average
age was only three years nine months, achieved a success rate of 88
percent.

Hughes went on to further trials, using more complex arrangements
of walls, with as many as five or six sections, and introducing a third
policeman. The three-year-olds had more trouble with this, but they still
got over 60 percent of the trials correct. The four-year-olds could still
succeed at the 90 percent level.

In these and other studies [Donaldson shows] the critical role of
[language]. Standardization depends upon a common language
between examiner or tests [and the examinee].

In any event, the questions the children were answering were
frequently not the questions the experimenter had asked. The children's
interpretations did not correspond to the experimenter's intention; nor
could they be regarded as normal given the rules of the language. The
children did not know what the experimenter meant; and one is tempted to
say that they did not strictly appear to know what the language meant. Or,
if that seems too strong, one must at least say that something other than
the rules of the language was shaping their interpretation - something
perhaps like an expectation about the question that would be asked, an
expectation that could be influenced by the nature of the experimental
material. However, it is essential to notice that we may not conclude that
the children were, in some general way, not bothering to attend to the
language - for we must recall the dramatic effect, in some of the studies,
of the inclusion or omission of a single adjective.







It should take little imagination to see that if examiner and examinee
from the same culture may still misunderstand each other, the problem is
exacerbated in cross-cultural settings. The cross-cultural setting is most
frequent where [minority] children are concerned, since many if not most of
the examiners and the tests which are used with children are quite alien to
their experience.



In short, if assessment, interrogation and interpretation become heavily
dependent upon the surface features of a particular culture, systematic
assessment can still survive. However, mass production of testing instruments,
in this case, must be discontinued. Mass production of testing instruments will
be appropriate for use with all cultural groups when those instruments are able
to tap "deep structures."



WHY CULTURE IS USUALLY IGNORED IN TESTING AND ASSESSMENT



While culture is real and is a major variable in human experience, United
States social science which supports standardized testing seems not to have
caught on. There appear to be several reasons for this:




1. Culture is "invisible" or out of consciousness for most of those who
have not been trained to perceive it.

2. The popular labels which are used to identify cultural groups are
almost always confounded. They are not precisely defined terms. In fact,
they are frequently undefined. For example:

a. "Race" is not the same as culture. Therefore, terms such as
"Caucasoid," "Negroid," or "Mongoloid," if they mean anything
(Barzun, 1965) (Benedict, 1968) (Montagu, 1974), do not define cultural
patterns.

b. Geography is not equivalent to culture. Therefore, terms such as
"German," "Asian," or "Mexican" do not define cultural patterns.

c. Poverty is not equivalent to culture. Therefore, the condition
of [...] does not define a cultural pattern.
Socioeconomic status [...] is not cultural.

d. Religion is not equivalent to culture. Therefore, terms such as
"Jew," "Protestant," or "Catholic" do not define cultural patterns.

e. "Minority" is not equivalent to culture. Therefore, the
condition [...] does not define a cultural pattern.




3. A given (cultural) group may [...] to change its name. This may add to an
observer's confusion. For example, are [...] "[...]can-Americans" the same in
the United States of America? For example, [how] does an observer determine
[who is] in a "Black" sample for a [...] cultural study?

4. The word "culture," while widely used, is not used with precision
[...] to specific dimensions [...]
in mind by those who use the word. Further, different users may










[assign] different weights to the dimensions. For example, [one person may say]
"culture" while meaning language and customs.
Another person may say "culture" while meaning [...]
beliefs or world view. There are [...]



[...] processes which were intended to be universal [...]
becomes more costly.



6. Since the unsophisticated [...] identification of [...] tend to
[...], observers tend to conclude that culture is of little importance.
For example, if "race" is equated to culture, different cultural groups may
actually be grouped together, as if they were the same for research purposes.



7. The politics of a situation can function to shut down communication
altogether. An observer of another person or group may see that person or
group as a threat, or may have a vested interest in the exploitation of
members of other cultures; then, the picture of the other culture which
emerges will tend to be a self-serving rationalization of the alien group's
cultural reality (Pearce, 1965) (Stanton, 1960) (Weinreich, 1946)
(Hilliard, 1979a).

CULTURE [...] IN TESTING AND ASSESSMENT



We should be able to see [...] that all access to meaningful
[...] of human [...] is through culture. That
is to say, whatever the underlying mental function or process which is being
assessed, e.g., deep structures as in language (Chomsky, 1957) (Labov, 1970), it
can only be manifest through the specific cultural material possessed by a
learner. It is, therefore, a truism to say that no other option presents itself
to us at this time.




THE MEANING OF "TEST"

The use of tests in education has become both arbitrary and ritualistic.
Testing is said to be arbitrary because the link between "testing" and
instructional improvement has not been demonstrated. Tests are ritualistic in the
sense that testing is an [...] still feel compelled to
perform. Yet when asked why they do it, they become inarticulate, prone to the
[use] of cliches, and tend to focus on irrelevant issues (Hilliard, 1979b).



[...] can be [...] if we compare the usual meaning of "test,"
[...] physics or chemistry, with the meaning of "test" [...].

Tests in physics or chemistry or medicine are performed [...]
characteristic properties [...] measurement instruments are well known and when
sources of variation are controlled. [To] "test" for the saline content of water
is to [...] with the properties of salts, as [well]
as the nature of their interaction. To "test" [...]
the behavior of blood under healthy and sick [...]
it may vary with specific types of infection. It should be [...]
"test" in education is not nearly so precisely defined or employed.
"Tests" are [...] to professional practice. In fact, one [...]
whether the state of the art in testing or instruction is






sufficiently developed and systematic to justify the use of the term "test" in
its more traditional scientific sense. To qualify as a "test," [...]
is required, over and beyond the simple criteria of instrument "reliability" and
"predictive validity." Prediction, if not explanation, [...] testing should
[...]. If not, [...]
"examination." A true test should reveal with clarity some reality which would
be obscure or ambiguous or invisible without the test. Perhaps the word
examination should be reserved for inquiry which is designed to determine if
certain skills and content are present. Then the term test could be used for
those systematic inquiries which are designed to render information which
explains teaching and learning. [Examinations tell] us what. Tests
should tell us why. In any event, there are two very different functions which
assessors perform that require quite different designations, if confusion is to
be avoided.







IT IS IMPORTANT TO EXPRESS CERTAIN IMPLICIT ASSUMPTIONS UPON WHICH SYSTEMATIC
ASSESSMENT IS BASED. THESE HOLD ESPECIALLY FOR STANDARDIZED TESTING.

SYSTEMATIC ASSESSMENT WILL ENABLE COMPARISONS TO BE MADE AMONG
INDIVIDUALS AND GROUPS. THESE COMPARISONS ARE BASED UPON A CRITERION
OR CRITERIA WHICH HAVE STABLE MEANING ACROSS INDIVIDUALS AND/OR
GROUPS.

SYSTEMATIC ASSESSMENT IN EDUCATION IS EQUIVALENT TO MEASUREMENT IN THE
PHYSICAL SCIENCES.

THE AGGREGATION OF SCORES ON PAPER AND PENCIL TESTS IS THE SAME AS THE
AGGREGATION OF COMPARABLE UNITS OF [...]. [...] ITEM IS SEEN AS
BEING THE SAME AMOUNT OF THE SAME KIND OF [...]. IF THEY ARE NOT [...]
IN THIS WAY, THE AGGREGATION IS [...].

[...] SHOULD HAVE UNIQUE RIGHT ANSWERS.

TEST DATA [...] GUIDE INSTRUCTIONAL STRATEGY VALIDLY, I.E., INSTRUCTION WILL
BE BETTER BECAUSE OF THE USE OF ASSESSMENT.

TEST ITEMS SAMPLE ADEQUATELY THE DOMAIN WHICH IS BEING EXAMINED.

VALID INSTRUCTIONAL STRATEGIES EXIST WHICH REQUIRE ASSESSMENT DATA IN
ORDER TO BE EMPLOYED.

THESE ASSUMPTIONS CANNOT BE MET IN PRACTICE. STATISTICAL PROCEDURES
[...] "DATA" ARE HIGHLY SOPHISTICATED. YET, "DATA" FOR PROCESSING ARE
[...] OR CONFOUNDED. IF THESE ASSUMPTIONS CANNOT BE MET, RITUAL AND
SUPERSTITION WILL CONTINUE TO PREVAIL.







A PARADIGM FOR SORTING DISCUSSION ISSUES IN TESTING




An abundance of issues in testing are frequently lumped together in
discussions and analyses. These must be sorted out and discussed one at a time
[...] if clarity is to obtain. When speaking of tests, [...]
of test, the audience or user of the information and the type of test being used
[must] be identified. Discussants must, in general, talk in one of the cells of the
paradigm at a time or discourse will be confounded.



[Figure: paradigm cube.] This cube represents discussion of an I.Q. test
for sorting children where the information is for the child.

[Figure: Example 2.] This cube represents the [...] test [...] for
sorting children where the information [...].



The culturally sensitive use of tests will depend upon clarity about the
uses of assessment and the user audience. For example, I.Q. tests are used as
sorting instruments by administrators. Yet the myth prevails that they also are
useful as diagnostic devices for the development of instructional strategies. The
cultural bias of the I.Q. test is justified by some who argue as if the I.Q.
test were really an [achievement] test for all audiences; i.e., that mainstream
culture and the schools require a certain [...] vocabulary and that certain
types of problems be solved. That argument represents a shift from thinking of
the I.Q. test as [...]
device. The shift is a major one which fuses two types of uses unsystematically
into one discussion.

Among the current uses of testing and assessment, the most easily justified
is the assessment of achievement. There the major issues for particular
cultural groups on this type of test are content validity and communication
accuracy. On the other hand, where I.Q. testing is involved, the main issues
for particular cultural groups are construct validity, as well as diagnostic
validity. Here, there are quite general grounds for questioning the validity of
the instruments (see for example Houts, 1977). Indeed, the I.Q. score is a
worthless piece of [...].

There are many [...]. The consideration of audiences [and] uses of
tests can bring clarity to a heavily confounded area. When tests are used across
[...], a paradigm becomes an imperative.



PRINCIPLES FOR CULTURALLY SENSITIVE ASSESSMENT

Any test or assessment procedure which responds to and uses the culture of
students would follow certain principles. I believe that the following would
result in greater validity [...].



1. [Tests and assessment procedures] must reflect [...] sensitivity
[...] culture of the [learner]. Example 4
illustrates in an imprecise way how culture (language) specific tests
might be constructed. By use of such culture-specific tests, it
should be possible to determine if a child is speaking and hearing in
[...] with a specific linguistic community. If so, the speech
assessor would [...] pathology, a reading teacher would [...]
"reading errors" [...] rule out mental [...]
of the child's "dropping" of final [consonants].

2. Tests and assessment procedures must yield a description of the
learner's [...], not simply the presence or absence of material
from the [...]. ([...], 1978)

3. [Tests and assessment procedures] must yield a description of
learners' processes, not simply the content of responses to questions.
(Piaget, 1970) ([...], 1979)

4. [Tests and] assessment procedures must yield a description of the
learner's progress, not simply the learner's status at a [...]
time. To be meaningful, the assessment of progress must be accompanied
by a description of the teaching services which were provided to the
[learner]. ([...] and Koenig)







5. Tests and assessment procedures [...]
teacher/learner and/or tester/[...]
testers are non-student [...]. (Rist, 1973) ([...],
1974)



6. Tests and assessment procedures must yield a description of the
ecology of the testing setting.



7. Tests and assessment procedures must be related clearly to a
valid theory of healthy or pathological functioning and valid
intervention.



Among other aims, systematic testing and assessment should be
used to assess [strengths] in learners, to reveal [...] and
to guide teaching strategies. New insight and [...] should
result from good assessment.








FOSTERING INTELLECTUAL DEVELOPMENT



Howard E. Gruber
Robert T. Keegan



Institute for Cognitive Studies 
Rutgers University
Newark, NJ 07102



In this paper I want to describe a method of teaching that grew out of
joint concern for one product of the evaluation industry and one aspect of
teaching. The story has three strands. First strand: When Arthur Jensen
published his famous Harvard Educational Review paper, apart from certain
technical criticisms, my main reaction was to dream of a demonstration
experiment that would show clearly that intellectual functioning could be
drastically modified by changes in educational practice. This would make
nonsense of the heritability argument as Jensen used it. Dobzhansky (1972) has
since [...], basing
his theoretical argument on results obtained from experiments with fruit flies.
My thinking ran along similar lines, but I was interested in human experience.

Second strand: My children attended the Free School of Bergen County, a
high school outside the public school system, run by the children themselves,
that had a noticeably longer life than many similar ventures. I taught there a
little. On the first day I arrived with a carefully planned lecture on imagery,
a topic that forms part of my research interests and that I know interests most
people. There were 20 students on a rug and
sitting on cast-off couches. I chose a spot on a piano stool by an
old piano. Looking around, I wondered if a lecture was a good way to begin.
Entirely on impulse, I told them in a sentence or two about synesthesia --
sometimes an auditory stimulus elicits a visual experience (or other such
combinations). Then I asked them to close their eyes, and try to see something
when I played a note. We went around the room, each person describing what he
or she saw. Both the diversity and the commonalities were intriguing. The
class's attention engaged, I drew breath and was once more about to start my
lecture. Someone called out, "Let's do that again!" I complied. We repeated,
over and over, with many variations. New facets of a complex process emerged --
Try to imagine a pure color, or a scene with motion, etc. This time, no sound,
imagine your breakfast table (shades of Francis Galton), etc. Very
occasionally, I made a remark about psychologists' previous work on visual
imagery. Suddenly our time was up -- 1-1/2 hours had flown by. The lecture had
become irrelevant. Actually, all the essentials came up one way or another in
our explorations. The students had discovered almost [everything], and interest
had not flagged for an instant. (Later, when I thought about the relation
between what we had done and the "discovery method" it struck me that one
important difference was that I had had no set objectives, nothing in particular
[I wanted] them to discover.) I came out of the school
flying high and wondering, "Why can't college teaching be as exhilarating as
that?"







Third strand: Over a long period of time, I received inquiries from the
Academic Foundations Department at Rutgers University-Newark (a SEEK-type
program): what are cognitive skills? how can we develop them? I resisted
giving any positive [...]
Department was terribly addicted to using workbooks and other mechanical
devices, taking a routine and Skinnerizing attitude toward the question of
remediation. Eventually, I began to feel that my attitude ought to be more
constructive, and I began to wonder what we, as an Institute for Cognitive
Studies, could do. My general approach was this: The human mind is a wonderful
instrument. When it is working well, people can do what they want, learn what
they want and need. When it is not working well, all the workbooks in the world
won't help. Question -- How do you get people to think better? I put this
question to a graduate seminar as our term project. Their first reaction was to
raise an ethical protest: Who were we to tell other people how to think?
Struggling with this problem had an important and, I think, profoundly beneficial
effect on the program we worked out.



[...] TEACH PEOPLE TO THINK BETTER

I shall talk about the need for and possibility of educational programs
which make direct attempts to teach people to think better. Most of what I have
to say will deal with the need, rather than with a detailed description of
actual programs we have conducted in several settings. Our work is only a few
years old; we are still in the inventing phase, and not yet ready to proclaim
our methods from the housetops.

The aim of the program we have been developing in Newark is to develop a
method for teaching people to think better. While our primary goal is in the
field of innovative teaching methods, a fundamental psychological question is
also at stake: can we alter the course of intellectual growth after the early
formative years of childhood?

I begin with three examples to show that the acquisition of verbal and
symbolic skills in a conventional way, even to an exceptional degree, does not
necessarily indicate equally satisfying intellectual functioning. At the level
of professional life, the Soviet psychologist Luria described the now celebrated
case of Mr. S, who, although gifted with extraordinary powers of visualization and
memory, was not in other respects a particularly gifted person, and in some ways
was rather limited. At the level of graduate school performance, it is now
notorious that high scores on the Graduate Record Examination, emphasizing
verbal and quantitative skills, correlate very poorly with success in graduate
school. At the level of undergraduate performance, Professor David Griffiths of
Essex County College has recently shown that students receiving a grade of C or
better in college level introductory physics and chemistry courses have often
achieved this success without being able to reason at the level of formal
operations as described by Jean Piaget. What is perhaps more significant in the
present context, Griffiths found that, in a typical state university population,
there were many students who seemed to have succeeded in their science courses
on the basis of a thin veneer of verbal skills, without any general or abstract
grasp of what they were learning; meanwhile, at a nearby community college, the
students performed as well -- or as badly, if you prefer -- on tests of formal
reasoning, but lacked the aforementioned verbal veneer. Finally, as might be
expected, there seemed to be at least some cases in which this verbal veneer
actually got in the way of good thinking. Someone once said, "Words are a
writer's worst [enemy]." The rest of us should watch out for them too.



Insofar as evaluation is meant to [measure what has been]
learned, at least some of it should be done long after the learning
experience. That is the way to find out how well education works. But
there is [very little] research on real long-term retention, and most
teachers have no information at all about the abiding consequences of their
work. The [...] must someday be faced that long-term retention is
bound up with understanding. And understanding cannot be achieved by
skipping quickly through 30 topics, 2 per week, as in many a [...]
course. An anecdote will help to bring out these points:



One of the present authors has long been interested in people's
understanding of elementary physical principles. In a relaxed and pleasant
setting at a summer place, by way of explaining this part of my work to an
old friend, I asked her a simple question -- "Suppose you are in a closed
railway car traveling in a straight line at constant speed. You stick out
your hand in the aisle and drop a tennis ball. Where does it land?" After
a long silence, she burst into tears and sobbing. "Margot! Why are you
crying?" "Because I got an A in college physics!" From a strong,
traditional university [...].

This [...] "[...] but I never understood
anything" -- a remark Margot made later -- arose also in a series of
interviews on [...] and understanding conducted by Professor
[...].



I do not mention these facts in order to denigrate or minimize the
importance of organized knowledge and fundamental symbolic skills such as
reading. But it is vital to see the process by which such knowledge and skills
are mastered in its total psychological and intellectual context.

Imagine a hypothetical case. For example, an individual with a miserable
high school background, now in his or her middle twenties, finds his way back to
school in a community college setting. He recognizes some fundamental
deficiencies in his academic skills and wants to correct them. Now let me add
one further premise -- that this individual is what we really mean by a good
student -- someone going to college to improve his mind, to have a rewarding
experience of personal intellectual growth.

For a while, our hypothetical student may be cajoled or coerced into various
training programs narrowly focused on particular problems of remediation. But
he is too mature and sophisticated to be very long seduced by the allure of
getting good grades, and he is skeptical about any promises of a relation
between grades and eventual success in climbing some career ladder. The good
student wants the remedial work if and only if it is clearly a part of a
rewarding experience of personal intellectual growth.

Our hypothetical student may not be aware of all these subtleties at the
outset of his post-secondary education, but as he progresses in his walk through
the groves of Academe, he becomes increasingly aware of the fungus on some of
the trees, the dead wood, and the stagnant pools across paths going nowhere.
With a little more luck he may also become aware of, or begin to dream of,
another part of the forest, where things are growing better. We may flatter
ourselves into thinking that we can keep him pointed toward his workbooks and




away from the Tree of Higher Knowledge. But the secret gets out. He has his
own way of knowing that when he is all done with his remediation, things may be
no better for him than they were for Luria's Mr. S or for Griffiths' subjects.
[...] the better souls -- the more perceptive,
[...] and hardier individuals -- who
[...] limitations of [...] to turn away from it. It
may be the good student -- [...] rewarding experience of personal
intellectual growth -- who most [...] sees the main danger of college life:
not only does it disappoint, [...] spoil [...] mind.

Some of you who are [...] may think that good students such as
the one I have imagined are rare and vanishing. But in our Practicum for the
Improvement of Cognitive Functioning, where such issues are brought out into the
open, we find this kind of good student to be the rule, not the exception. And
Hans Furth and Harry Wachs in their book, THINKING GOES TO SCHOOL: PIAGET'S
THEORY IN PRACTICE, were describing the attitudes of the child who would become
this young adult when they wrote, "The permutation game is a developmentally
high-level activity which carries its [...] is
often experienced as low-level activity." (p. 271)

If you struggle with the question, "How do you get people to think
better?", a number of reasonable responses come to mind. The plurality of these
responses should not be viewed as a problem but rather as indicative of the fact
that "good" thinking is not a monolithic process. "Good" thinking is productive
thinking and it requires many complementary component skills. In the attempt to
develop an effective program for improving the quality of an individual's
thought processes, several themes forcefully emerge. These central themes serve
as a guide for the development of the particular tasks or "situations" that we
utilize in the classroom. One of these themes has already been alluded to,
namely, the great advantage of having access to a large array of cognitive skills.
A large array allows for flexibility of thought or, stated a little differently,
it enables an individual to have more than one way of thinking about whatever he
wants to think about. Repertoire enlargement, then, is one of the central themes
to be discussed.

The classroom setting is well suited to the task of enlarging a student's
repertoire of cognitive skills because it contains diversity, a key component in
effectuating this expansion. Each student has his own way of approaching a task
and in many cases the student feels that his particular approach is the only
conceivable method of operation. However, the inevitable diversity of
approaches among individuals in a classroom provides a rich natural resource for
exploration. In order to take advantage of this pool of diverse responses, the
teacher has to assume the role of a moderator, rather than the more traditional
"lecturer" role. The teacher focuses the dialogue among the students,
emphasizes certain points, and does some degree of synthesis, but the "food for
thought" arises from the students' interaction with each other. Dialogue
offers an individual the opportunity to see his own thought processes and
capabilities mirrored in others. Feelings of "I never thought of it in that
way before" or "That's where I was going wrong" are compelling learning
experiences.

We see then that the expression of diverse approaches to a situation or
problem can expand the repertoire of cognitive skills of an individual by making
him aware of approaches or strategies which had never before been available to
him. But this is not enough. [When]
confronted with a puzzling situation calling for a novel response, the
individual may not have access to a group of people with whom he can enter into
a dialogue for the purpose of expanding his range of possible responses. The
individual must be able to generate alternatives on his own. The model of
external dialogue has to be internalized for it to be of lasting worth. This
internal dialogue is an intrinsic part of a reflective cognitive style.
Reflectivity is another of the central themes concerning productive
thinking we referred to above. Reflectivity is not one particular cognitive
skill, but rather a constellation of skills that can be thought of as
constituting a cognitive style. Promoting this style is a cornerstone of our
approach to education.




We have already described reflectivity as having the flavor of an internal
dialogue. This description, however, should not be taken literally to mean that
a fully composed subvocal conversation takes place in the head. As in a normal
dialogue, the essence of understanding derives from the attempt to reconstruct
the point of view of the other, the effort to attend to what the other is
saying, and a certain degree of "reading between the lines". The following
section deals with the role of "point of view", "paying attention" and "reading
between the lines" in the internal dialogue characteristic of the reflective
style.



The term "dialogue" presupposes more than one point of view and for this
reason it captures the essence of the reflective style much more accurately than
conceptualizing reflectivity as consisting of an internal monologue. In our
discussion of the theme of the [expanded] repertoire, we pointed to the efficacy of
the dialogic process in bringing about an experience of "I never thought of it
in that way before". The construction of a new point of view, which often
consists of restructuring familiar material, constitutes an expansion of the
individual's repertoire. Moreover, experiences of this kind should eventually
communicate the point to the individual that the particular approach that first
comes to mind is not the only approach possible. This realization, in and of
itself, is a surprising revelation for many. The function of internal dialogue
is not so much to denote the "correct solution" to a situation but to supply the
right questions, questions such as, "Is there another way of approaching the
situation?", or "How can I change the situation in order to clarify it?" These
types of questions can serve as guidelines for constructing a new point of view.



It may at first appear trite to say that "paying attention" can help
understanding, but "paying attention" has a special meaning with respect to
reflectivity. All through childhood we are told to pay attention to our
parents, our teachers, and to numerous others who have some degree of authority,
but we are seldom if ever told to pay attention to ourselves. Reflectivity
requires an individual to pay attention to his "line of thought". By closely
attending to your own thought processes, a whole set of experiences that was felt
to be of little use now assumes great importance. The type of experiences we
refer to is the commission of errors.



Piaget has shown us the value of attending to errors in the analysis of
children's thinking. A careful analysis of the protocols of children who commit
errors on a particular task can reveal information about the structure of the
child's thought which cannot be determined by looking at "correct" or successful
[...]. Why shouldn't the same be true at the adult level? Errors can






provide a wellspring of information for the attentive cognizer. For one thing,
errors can be used to isolate problem areas in a line of thought. In striving
[...],
certain decisions and assumptions must be made in order to carry forward the
cognitive task at hand. When the cognizer [...]
either passively accept the correction and proceed accordingly, or he can [use the]
experience to identify the point at which he went wrong. In and of itself, the
commission of an error is not a valuable tool for learning, but it can function
as such if the learner takes the opportunity to discover the discrepancy between
his chosen approach and other possible approaches. Reflectivity [involves]
periodically "rewinding" the stream of thought, identifying the points of
conflict or ambiguity where the error arose, and generating alternative courses
of action (points of view).

Earlier we stated that understanding a dialogue entailed a certain degree
of "reading between the lines" and that the same process was true for the
internal dialogue characteristic of the reflective style. When you "read
between the lines" you glean information that is not explicitly present but is
somehow hidden and implied. The transformation of implicit knowledge into
explicit knowledge is an act of creative liberation. Explicit knowledge can be
used in a manner in which implicit knowledge cannot. Explicit knowledge is
definable, mobile, and versatile by way of these characteristics. Implicit
knowledge, on the other hand, is amorphous, frozen and non-versatile because it
is so deeply embedded in context. Where knowledge that is explicit can be
utilized in numerous contexts, implicit knowledge cannot be utilized in such a
way because it has not yet been differentiated from context, and therefore it is
confined to play a limited and limiting role in the cognitive life of the
individual.

The recognition of the role that implicit knowledge plays in cognitive life
is reflected in the concept of the presupposition. In even the simplest
statement, numerous "unconscious" assumptions are made. For instance, in the
simple request to "Please pass the sugar", the speaker assumes that the person
spoken to can understand English, is physically capable of carrying out the act,
is socially inclined to cooperation, and can identify sugar whether it be
contained in a bowl, packet, or cube. Of course, each of these assumptions can
be further divided into additional assumptions. The supposition that a person
"understands English" involves assumptions concerning lexicon, syntax,
phonology, contextual meaning, etc. While presuppositions are certainly a
necessary component in "economizing" cognitive life, unconscious assumptions can
also prevent productive thought from occurring. A prime example of how an
assumption can debilitate thought occurs in the form of the syllogism. In order
to correct a syllogistic line of reasoning it is necessary to specify the
premises on which the reasoning is based and to root out the tacit implications
contained in the premises. The act of making the implicit assumptions explicit
liberates one from the fallacious line of reasoning.

Implicit knowledge forcefully affects the construction of a point of view,
and a point of view serves as a guide to action and furthers thought about a
subject. If I assume the world is flat then I will not attempt to sail "around"
the earth. If someone else tries to sail around the earth and they do not
return, I will then confidently conclude that the foolhardy crew fell "over the
edge". Facts get interpreted in a manner that makes them consistent with the
harbored point of view.


A recognition of the potent role of implicit knowledge in cognitive life
and the intention to root out the presuppositions of a line of thought are
powerful tools in achieving the development of a reflective cognitive style. A
modicum of playfulness, or "mischief", can be helpful in developing this
reflective style. Assuming the role of a "devil's advocate" can also be
instructive. For instance, if you start with the assumption "the earth is at
the center of the universe", what does this assumption do to your thinking about
all the other celestial bodies? This type of game playing can be quite
instructive and liberating in that it helps to specify the place of assumptions
in a line of thought.

From the preceding description of the goals of our course it should be
evident that these goals are not specific to a course in Introductory
Psychology. The skills outlined here are applicable in a wide variety of
contexts, and that is precisely why they are of significant value. Later, we
hope to show that not only the goals, but also the method we utilize is
compatible with the teaching of other disciplines. The specific subject matter
we worked with should not obscure the versatility of the method or the validity
of the underlying goals.



ON SAWING A BOARD IN HALF, THINKING AND SELF-CRITICISM

In conventional education the functions of instruction and evaluation are
kept separate. First, the teacher teaches and the students learn. Then the
teacher tests, then the students show what they have learned, and then the
teacher evaluates this performance. In conventional education it is not thought
bizarre to separate the person still further from the process of evaluation.
The "test" may be taken away from the student, sent to another city, put into a
machine, and transformed into a number that bears absolutely no resemblance to
what the learner learned.

Not all human activity is organized in this way. In some instances,
performance and evaluation are inseparable. The carpenter rules a line and uses
it to saw a board to a desired length. Every stroke of the saw is guided by the
line and by the immediately visible performance. Corrections are not made
because a third party ordains them, but in the dignified transaction between the
sawyer and his work. There is a vital correspondence between the "test"
administered by the ruled line and the work being done.

Let us examine for a moment these two educational structures: that arranged
for the communication of knowledge and that ordained for the evaluation of the
student's success in playing an appointed role in the communicative process. To
simplify, we will consider only one type of class, a typical lecture course.

First we look at the structure of communication. In a typical lecture, the
main activity can be described as one-many and one-way: one teacher talks to
many students, and communication is almost entirely from teacher to students.
Even a sensitive and concerned teacher has little opportunity to know what the
students are thinking. They are silent. After class, it would be unusual for
the teacher to look at some students' notebooks.

The teacher prepares carefully, works hard and continuously in class. But
he or she works largely in ignorance of what the students are thinking meantime.
Of course, we teachers tell ourselves about non-verbal communications, facial
expressions, and occasional questions, but all this gives only a very blurred
reflection of the richness of our knowledge and thought we exhibit for our
students.

To find out how much they have absorbed, we must typically wait until it is
time for the test. Unlike the sawyer at work, the test is usually considerably
separate in time from the rest of the activity. Moreover, the student does not
evaluate his own performance. Often enough, he has only a vague idea of the
criteria the teacher has used. Even when the teacher tries to spell these out,
this explanation is seen as part of the structure of evaluation. Time spent in
it is therefore time stolen from the more important structure of communication.
Finally, the test result is given back to the student at a still later time and
transformed into a number. This number makes all sorts of administrative acts
possible. But it does not convey at all the teacher's impressions of what the
student actually did.

Some of the consequences of this arrangement:

1. Time taken for evaluation is minimized because it is seen as
alien to the communication of knowledge.

2. Although the teacher hopes the students will focus their
attention on the whole of what is being communicated, the students are
using their ingenuity to figure out what fragments will be evaluated.

Teacher (at end of inspiring lecture): Any questions?

Student (raising hand eagerly): Will that be on the test, sir?

3. Little attention is given to providing the student with
internal criteria for self-evaluation. The student lives in a world
where, for many formative years, communication, performance, and
evaluation are kept separate, and where someone else has the
responsibility for evaluating the work done.

Now let us suppose that we take seriously the educational goal of helping
students to become (or perhaps simply to remain) independent human beings,
interested in and capable of evaluating their own performances.

What are some of the things a professional worker does? First, he or she
has internalized criteria and a continuous sense of whether or not the work is
going well. Second, since criteria are not so easy to come by, he or she spends a
good deal of time developing them: talking with colleagues, reading a
critical literature, reflecting on his or her aims and progress. Third, when
"outside" evaluation is needed, the professional thinks about whose opinion
might be helpful and seeks it out. Fourth, this opinion is not sought in order
to put an alien number in a record book. The worker wants criticism that
corresponds to, is germane to, captures something of the work itself; such
critical commentary is often a re-description of the work. And finally, the
criticism is not merely listened to and the work then put away. The worker
alters the work in some way that is responsive to the criticism.

It is a striking fact that none of these attributes characterize the main
evaluation processes used in formal education. Conclusion: we are not teaching
our students to be independent, self-evaluating human beings.

Our explicit goal in this project has always been "to help students to
think better." Almost inadvertently, we have found that we have also been
exploring new ways of coordinating the structures of communication and
evaluation, and ways of modifying the process of evaluation itself to
encourage the students to become people capable of self-criticism.

This new perspective on evaluation was a by-product of our work. (1) But
now that it is there, it seems obvious that self-criticism is an integral part
of good thinking and should always have been one of our goals. Sometimes making
goals explicit facilitates pursuing them. Our work will probably change now,
with this new-found recognition.

METHOD

In our attempt to help our students to think better, we have had to depart
from the lecture as the primary means of "educating" our students. Instead, we
have utilized three alternative classroom structures or formats that
substantially modify both the students' and the teacher's conventional roles.
Each of these formats has its distinct advantages, but, in the main, they all
require a student to actively participate in a set task and to reflect upon his
own course of action. By the same token, these alternatives also require
restraint and patience on the part of the teacher. The student must be allowed
to pursue his own method of dealing with the task, to make his own mistakes, and
to develop his particular line of thought free from the well intentioned but ill
timed intervention of the teacher. By allowing the student to discover the
subject matter for himself, you largely obviate the necessity of lecturing to
him about it, and class time can be used in more flexible and productive ways.

The "Round Robin" format has the great advantage of ensuring the
participation of everyone in the class. What typically happens in this format
is that a task is described to the class and everyone is asked to work on it.
Following a period of individual work on the task, every member of the class is
asked in turn to report on his experience in dealing with the task. It is
essential that every student be heard from and that none be allowed to withdraw
from the process. It is important not only for the student himself to become
acquainted with this method of analyzing his own thought processes but also for
the other students to have a point of comparison for their experiences with the
same task.



The round robin format is not without its drawbacks. Inevitably, several
students in any group will say things such as, "I did it the same way as John
did it" when asked to give their reports. Replies of this sort do not have to
be accepted. The teacher can carry forward the discussion by asking the student
to describe in what way he felt his experience to be the same, in what way it
might have differed, or simply to put the experience into his own words. It is
surprising how often this technique will uncover some new slant on the material.
It also prevents other students from attempting to withdraw from the situation
through this "me too" technique.

Another aspect of this round robin structure that may appear to be a
drawback involves the time element. It takes time to go around the room and
listen to the report of each student with care and interest. There is no quick
way to explore the diversity of responses, identify the common elements, and
emphasize the productive aspects of the material that arise from a class. This
restriction becomes more salient with large class sizes. One way of dealing
with this situation is to use the round robin in conjunction with other formats
that also promote a reflective cognitive style.



The "Small Group" format also involves the initial presentation of a
problem or task, but with this format the active work on the task occurs in
groups of four or five. The members of the groups are encouraged to freely
exchange their viewpoints, offer tentative solutions, and work toward successful
completion of the task. Working in a small group provides quite a different
experience from an individual's encounter with a task. Interestingly, what is
often hidden from ourselves is clearly revealed in others. The round robin
format makes it apparent that describing one's own thinking is difficult,
but some of the same students who find the self report so difficult can facilely
describe how someone else in the group handled the task, and how the group as a
whole proceeded. Over time, the experience of attending to and describing the
group process, as well as receiving feedback on his own contributions to the
group, should help the individual to develop an analytical sense of his own
thinking. When it comes time for hearing the reports on the progress of the
groups, this can be handled by having one, several, or all the members of the
group give their descriptions, effectively creating a round robin type of
situation. There is nothing [...] or sacrosanct about these formats and they can
be used in interesting combinations.



The distinguishing characteristic of the "Demonstration" format is the
departure from assigning the identical task to everyone in the class. From time
to time we have found it useful to split the class into a few large groups and
assign each group a slightly different variation of a task to perform. This
structuring of the class enables us to single out specific factors (i.e.,
organization) and examine their influence on the way we think. While the
chief purpose of the exercise is to provide a clear and compelling demonstration
of the selected factor at work, the differences between individual approaches to
the task can and should also be examined. The round robin can be utilized to
accomplish this aspect of the exercise.

On occasion, [...] a "mini-lecture" at the end of a class session to
clarify a point or extend a topic in a new direction. Although this is a [...]
concession to the conventional classroom model, we justify its use by keeping
the number of such [...] at a minimum, placing them at the end of the
session, and by making sure that each mini-lecture is of a short duration.
However, it is the [...] formats that
constitute the essence of our [...] a good
picture of these formats in action.



"POWERFUL IDEAS", THINKING, AND REPRESENTATION



At the first meeting of the class in September 1979, we wanted to try
introducing the course in a new way. Often we begin by moving the students
immediately into the round-robin format, demonstrate this process at work, and
then discuss the reasons [...]. This time we wanted to see [...] the students
could play some part in [...]



[...] There are many subject-matter options open to the
teacher of [Introductory Psychology], or for that matter, any course. How [...]
choose?



Discussion was lively. We rushed the pace a little and as soon as it
[...] on a student's remark to steer the discussion toward
expressing the strategy of choosing "powerful ideas". I have heard much mention
of this phrase in the last year or two, yet no clear idea given as to how to
know when an idea is powerful. My own guess is that no idea is of itself
powerful; a person makes an idea powerful by linking it with other ideas. Some
of this came out in discussion. Fairly rapidly we got around to the idea that
the most valuable thing that could happen to a person in school would be to
learn to think better, or to use his or her mind better. At this point, I
explained that that was indeed the primary goal of the course, and introduced
the first exercise, which really involved three steps.



First, I asked them to write down a paragraph or so explaining what they
meant by "thinking." Second, after some discussion of "representation" as a
powerful idea, I asked them to draw two diagrams, a diagram of a dialogue and
a diagram of an ordinary classroom situation, depicting the pattern of
communication among the participants. The paragraph on thinking was taken up in
a later class meeting, where we worked out a way of categorizing the different
responses and discussed the class's ideas. The diagrams of dialogues were
mainly of two persons linked by arrows or lines, indicating that they were
talking to each other. In what follows we examine only the students' diagrams
of an ordinary classroom situation.



These diagrams were divided into the six categories shown in Table 1. It
can be seen that by far the dominant description was a one-way, one-many
interaction between the one teacher and the many students.



TABLE 1 



CATEGORY 



A. One-way, one-many communication ; 
(one teacher telling many students) 



17 






B. As above, but with some side-chains
(one teacher telling many students,
some interstudent communication)

C. [...] every student
engaged in some interstudent
communication

D. Teacher-student interaction seen
as a set of 1-1 dialogues

E. Two phases: A above, and [...]
[...] indicating [...]
discussion

F. Unclassifiable



12. 

St 



4 



v. h 

■• - / ■ 






(1) We were probably only dimly aware of the point, not yet having clearly
delineated the separate but corresponding structures of communication and
evaluation. At the March 1979 meeting of the panel on Assessment in the Service
of Instruction, one of the authors described an incident that had occurred in
our teaching (see below, the section on perspective-taking). The other
participants in the meeting pointed out to us quite emphatically that our work
was a form of evaluation-in-the-classroom!

By the time the discussion of these diagrams was over, the students had
moved into round-robin phase: Every individual had done some work alone. Every
student had given some description of his or her thinking. A general discussion
had followed.

If we were doing this exercise again, there are some improvements I would
like to introduce. First, there should be a more careful discussion of
representation, i.e. how to make a good diagram. We might borrow Howard
Nemerov's phrase, "image and caption." The students should come to see that
their ideas can be translated into pictures or diagrams, and the pictures can be
translated into captions. It would not be necessary [...]
to do this: the necessity for the caption, or more generally, for a
multi-modal way of thinking and expression, could easily arise out of the class
process itself.

Second, the structural diagram of the communication process is incomplete
without another representation: that of the evaluation feedback loop. Of this
more later.

Third, it would probably be a good idea to introduce the distinction
between real descriptions and idealized categories. This could have been done if
we had foreseen the variety of descriptions the students would give, and the
consequent need to code and tabulate them in order to make sense of the ensemble
of responses given by [...]
richness of the students' responses.

Fourth, it would have been good to couple the class exercise with some
follow-up reading about research on communication patterns.

In spite of these regrets, it should be stressed that this class period
worked well. The students got the idea of the course, and joined in the spirit
of it. They got to carry out an exercise in representation, and to reflect upon
it. And the teachers, for the nth time, were properly chastened by the once
again unexpected complexity and interest of the students' thinking.

At a workshop with college teachers (Douglass College, Rutgers University),
we repeated the task of drawing diagrams of the typical college classroom. For
the most part, the representations were similar to those produced by the class
in Newark, although mainly of the B-type (some side-chains). There was also
more explicit recognition of the presence in the classroom of some students who
are "out of it."

It is hard to imagine high level productive thinking occurring without the
involvement of the memory system. A thought must be held in mind long enough to
be [...]; this process involves memory. Additionally, the productive
thought must be retained long enough to be translated into some symbol system
such as language or mathematics in order for it to be preserved and recognized
as a "productive" thought. The symbol systems themselves involve tremendous loads
on memory. Stated quite plainly, memory is a requisite of productive cognitive
functioning.

Many interesting and important questions can be asked with respect to
memory. Investigators have explored the areas of memory capacity, the nature of
the memory trace, the structure of memory and numerous other aspects of the
topic. However, in keeping with the course's major goal of helping people to
think better, our classroom exercises have stressed the importance of the
particular strategy used to store the "raw material" which is to be remembered.
The choice of strategy can be shown to greatly affect the nature and amount of
material that will be recalled.

People spontaneously generate different strategies for remembering material
and one of the most commonly chosen of these strategies is simple repetition.
For instance, it is possible to learn and remember the colors of the spectrum by
repeating over and over again the color names red, orange, yellow, green, blue,
indigo, and violet. The trouble with using this particular method of storing
information is captured in the word "memorization". For many of us, the word
"memorization" evokes recollections of intense boredom, feelings of resentment,
and images of hackneyed poetry recited with military precision. Even if the
lack of interest and enthusiasm for the task can be overcome, the information
acquired through repetition is generally isolated from larger, more coherent
organizations of information. The critical question then is, "How can the
individual acquire new information without depending on rote memorization?"

Various techniques for organizing unwieldy or unrelated material exist
which can be used to facilitate the retention of unfamiliar items. Although
these techniques are grouped under the term "mnemonics", the particular
mechanisms of these various memory devices are quite diverse. With respect to
the spectrum example described above, a good mnemonic for remembering the
appropriate colors is to take the first letter from each color name and
construct the name "ROY G BIV". This "first letter" technique, which is a good
way of abbreviating information, is quite common and effective within a
restricted domain. But therein lies the problem with all mnemonics. The
arbitrary, artificial connections that are made between the items to be
remembered are inappropriate for larger organizations of knowledge. The
richness, complexity, and subtlety of such systems as Piagetian psychology,
quantum theory, or the "self" cannot be reduced to [...]
relationships.
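
As a minimal sketch of the "first letter" technique just described, the
following Python fragment builds the mnemonic from the seven color names given
in the text. The function name first_letter_mnemonic is our own label for
illustration and does not come from the course materials.

```python
def first_letter_mnemonic(items):
    """Join the upper-cased first letter of each non-empty item, in order."""
    return "".join(word[0].upper() for word in items if word)


# The color names are the ones listed in the text, in spectral order.
colors = ["red", "orange", "yellow", "green", "blue", "indigo", "violet"]

mnemonic = first_letter_mnemonic(colors)
# The spaces in "ROY G BIV" are added by hand to make the result read as a name.
print(mnemonic)
```

The sketch also makes the limitation visible: the result records only the order
of the items and says nothing about how the items relate to one another.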

Despite the limitations of mnemonic techniques, they do accentuate an
important point: that the "raw material" of experience need not be "taken in" as
it is presented, but can be worked on, transformed, and manipulated in various
ways and to different degrees. While the artificial, limiting aspects of
mnemonics persuaded us not to pursue this topic through class exercises, we do
emphasize the functional value of restructuring material; however, more
meaningful forms of organization than those provided by mnemonics are explored.
Each form of organization or "strategy" has a different impact on retention and
each individual can bring his unique knowledge and experience to these
organization tasks.



In pursuing the importance of what the subject does to the "raw material"
of experience in order to remember it, we exploited the psychological experiment
as a tool for illuminating this issue. After all, an experiment is designed to
inform us about something and it therefore coincides quite well with pedagogical
goals. The particular design we used had three distinct conditions, and the
students were equally divided among these conditions. All three groups were
told that they should try to remember as many words as they could from a list
that they were to receive. Group 1, the "Uninstructed" group, was given a list
of thirty-one words with no further instructions than those already given.
Group 2, the "Instructed" group, received the identical list but with the further
instruction to organize the list in conjunction with trying to memorize it.
Group 3, the "Pre-categorized" group, received a list that contained the same
words as groups one and two received, but the words were now arranged
hierarchically. The superordinate category was "things" with the subordinate
categories "alive" and "manufactured things"; "alive" was further subdivided
into the classifications "animals" or "fruit" while "manufactured things" were
grouped as either "furniture" or "weapons". Several minutes were given for the
students to work on their respective tasks.

The results of this demonstration are summarized in Table 2. The
experiment proved to be quite useful in demonstrating the powerful effect of
organization on recall.

Table 2

THE EFFECT OF ORGANIZATION ON MEMORY FOR A WORD LIST

GROUP               ITEMS RECALLED

Uninstructed        19.9
Instructed          22.3
Pre-categorized     29.4
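
The figures in Table 2 can be restated as proportions of the thirty-one-word
list and as gains over the Uninstructed group. The short Python sketch below
does only that arithmetic. It assumes the table entries are group averages of
items recalled, and the names mean_recall, proportion_recalled, and
gain_over_baseline are illustrative, not part of the original exercise.

```python
LIST_LENGTH = 31  # number of words on the list, as stated in the text

# Values copied from Table 2 (assumed to be average items recalled per group).
mean_recall = {
    "Uninstructed": 19.9,
    "Instructed": 22.3,
    "Pre-categorized": 29.4,
}


def proportion_recalled(mean_items, list_length=LIST_LENGTH):
    """Fraction of the list recalled, given the average number of items."""
    return mean_items / list_length


def gain_over_baseline(group, baseline="Uninstructed"):
    """Difference in average items recalled between a group and the baseline."""
    return mean_recall[group] - mean_recall[baseline]


for group, mean_items in mean_recall.items():
    print(
        f"{group:<16} {mean_items:5.1f} items "
        f"{proportion_recalled(mean_items):6.1%} of list "
        f"{gain_over_baseline(group):+5.1f} vs. Uninstructed"
    )
```

Read this way, the instruction to organize adds a little over two words on
average, while a list that arrives already organized adds more than nine.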

In the next class session, we shifted focus to a slightly different aspect
of the topic. Memory does not contain only those things we have explicitly
tried to or have been told to remember. A myriad of facts and experiences
reside in our memory system. How then do we account for which material is
retained in memory? One factor that can be shown to have a strong impact on
what we remember is the way in which we assign meaning to a particular action or
task.

Meaning is a catalyst for organization. When we say that something is
"meaningful", we are stating that it engages the well orchestrated
system of interests, beliefs, prejudices, needs, etc. that form the
organization we call the "self". Any experience that taps into this
system will be organized in a more powerful way than experiences that
remain isolated or independent from this organization. One would also
expect the superior organization of "meaningful" material to result in
enhanced recall for such material. In order to further explore the
relationship between meaning and memory we again used an exercise
modeled after the standard psychological experiment.

A word list consisting of adjectives was distributed to the class. The
class was then divided into two groups by giving half the class one set of
instructions and the other half an alternate set of instructions. The
"Counting" group received instructions to "Look at each word, count the number
of vowels in the word, and write this number next to the word." We expected this
task to be regarded quite neutrally by the group.

In contrast, the "Self-reference" group received
instructions to "Look at each word and ask yourself whether or not it describes
you. If it does, put a check next to it." We expected this task to engage the
interest of the group.

It was necessary to prevent the students from intentionally memorizing the 
word list in order for the demonstration to be valid. With this in mind, the
class was told that the exercise was designed to demonstrate a certain feature
of language. Also, after the students completed the task, their papers were 
collected and five minutes of unrelated activity ensued. 

Following this period, the students were requested to write down as many
words as they could recall from the adjective list. When they completed this 
task, we asked each student, one at a time, to tell us the number of words he 
was able to recall. 


There was a clear and dramatic difference in recall between the two groups. 

While the "Counting" group recalled an average of nine words each, the "Self- 
reference" group recalled an average of seventeen words each. Up to this point 
the students were not aware that there had been two sets of instructions and
there was general puzzlement as to why there had been such a wide discrepancy in
performance between the two groups. When both sets of instructions were made
known to the entire class, there was a strong reaction on the part of many
students. It became immediately clear that, despite the fact that the "raw
material" for each group was identical, the group that had examined the word
list in the context, "does this word describe me?" had undergone a more
interesting, personal, and meaningful experience than the "count a vowel" group.
In this light, the discrepancy in recall performance between the two groups
appeared reasonable.

An experience of this type usually activates the class and provides a good 
amount of material for discussion. The inclination, prompted by time
constraints and the desire to arrive at a general synthesis, is to follow these
memory exercises with a teacher-led discussion of the issues. This procedure is
exactly the one we employed. However, by following this conventional model, we
probably short-circuited the individual's process of discovery in developing his
own strategies and techniques for dealing with the material presented in the
classroom exercises. In retrospect, we should have utilized the round robin
format to explore the diverse elements of the various constructions of the class
members.

Both the demonstration concerning the effect of organization on memory and 
the exercise involving the role of meaning in memory clearly illustrate the 
essential point that the "raw material" of experience need not be passively
registered. It can be transformed, manipulated, and digested. Strategies
ranging from simple repetition of the given material to use of mnemonics,
organization, and self reference represent different ways and degrees of making
the "raw material" of experience your "own". The experiments we utilized give
the student an opportunity to "try on" and evaluate the effectiveness of a
number of these strategies through the immediate feedback provided by their own
recall performance in relation to the recall performance of others using
different strategies. It also provides the student with another opportunity to
rediscover the fact that a vigorous, non-passive orientation to cognitive life
is important not only with respect to memory but also in such diverse areas as
problem solving, the recognition of propaganda devices, hypothesis formation, and
productive thinking in general.

PROBLEM SOLVING

Within the pantheon of cognitive abilities is the skill referred to as
"problem solving". The term is deceptive.
Problem solving is not a discrete, singular process that occurs the same way,
every time, for everybody, for every kind of problem. This conceptualization of
problem solving obscures the richness and subtlety of the process. Problem
solving is more accurately conceived as a purposeful utilization of a variety of
cognitive skills such as imagery, intuition, mathematico-logical thinking, etc.,
in a highly individualistic manner. Problems are also individuals. They vary
in content, complexity, and in the time needed for solution. We chose to
present problems that seemed capable of solution well within the time
constraints of a single class session.



Although we explored problem solving through the use of fairly restricted 
problems presented one at a time in the hope that this simple situation would be 
conducive to an examination of the solution process, we will probably extend our 
focus in the future. It would be interesting to present problems that are of
sufficient complexity to engage a student for a week, month, or even an entire
semester. A task such as this would certainly better approximate how problems
usually occur in real life. This does not mean that we should abandon the "half
hour" problem, but that we should supplement it with problems of another scale.

Among the obvious forms of feedback in a problem solving situation is the
actual solution or the response "right" or "wrong" from some arbitrary source.
However, we shifted attention to an examination of the solution process itself.
The class was divided into several small groups of four to five persons each,
and the general instruction was to freely exchange their ideas on the problem
and to keep track of how their thinking changed over the course of the problem,
thereby constructing a reflective record of successive approximations to a
solution. One of the problems we presented them with was a problem described by
the psychologist Karl Duncker over thirty-five years ago. The problem is as
follows:

    A person has a stomach tumor which cannot be treated surgically.
    A beam of radiation can destroy the tumor, but the beam also has the
    property of destroying the healthy tissue that lies between the beam
    and the tumor. How can this problem be solved?

Based on the responses from the class, the solution process seemed to fall
into several discernible stages. At first, there were several requests for
restatement of the problem in order to ensure that they had "gotten it right".
Following this "confirmation" phase, there was a period in which the majority of 
solutions either ignored or violated certain premises of the problem. For
instance, replies were along the lines of "make an incision, and focus the beam
directly on the tumor" or "treat the tumor with chemicals instead of radiation".
It was pointed out that while these solutions may be viable, they do not adhere
to the limitations imposed by the problem. The problem explicitly precluded
surgery and implicitly excluded going beyond the historical or "state of the
art" constraints, thus eliminating the chemotherapy option.

The next apparent phase displayed a strong tendency to concentrate on
protecting the healthy tissue of the body, such as applying a screening salve to
the skin. These solutions are not held for long because it becomes readily
apparent that although they are successful in protecting healthy tissue, they
correspondingly eliminate the ability of the radiation to affect the diseased
tissue. Even if it were possible to allow the radiation to pass through the skin
without harming it (selective protection), the problem of protecting the
intervening internal organs would still remain.

At this point in the solution process an interesting thing occurs. Having
had a "first go around" with the problem and coming up short of an answer, some
students seek to distance themselves from the problem by giving up on it, or by
contending that some "gimmick" must be involved. The latter reaction seems
quite reasonable in the face of the common past experience of having heard
similar types of problems which turned out to have punch-lines instead of
genuine solutions.

For those students who continue to pursue the problem (even the small-group
setup does not prevent certain students from "dropping out" of the exercise), a
curious shift takes place. A good number of the solutions now offered involve
putting the patient's body into motion. The question arises, "What would happen
if you rotated the patient's body so that the same spot on the outside is not
continuously contacted but the same spot on the inside is continuously focused
upon?" This line of reasoning represents a functional solution to the problem
under certain assumptions. For instance, in order for this solution to be
genuine, it must be assumed that the beam is weak enough not to cause damage
under conditions of brief exposure (as is the case with the surrounding tissue)
but strong enough to have an effect with longer exposure times (as is the case
at the point of the tumor). Since we are interested in the solution process
itself rather than getting the "right answer", we encouraged the class to
continue with the problem. We informed the students that there was another,
perhaps more elegant solution to the problem and that they should try to
formulate it.

While several students reverted to an earlier solution in a slightly
different form at this point in the exercise (i.e. put a tube down the throat
and "pour" the radiation into the stomach), other students stayed with the
notion of keeping the problem in motion. The critical development that occurred
at this time was a shift in attention from the body to the radiation.

The first solution offered after the shift of focus to the radiation is the 
converse of the "rotating body" solution. This new solution involves holding
the body in a constant position while the beam is rotated around the body.
Although this solution is very close in form to the "rotating body" solution,
the groundwork has been set for a "final" explanation. The problem has been
firmly established as one of focusing a beam on an inner location while
protecting the surrounding regions. By putting the problem into motion, the
critical idea of changing the location of where the beam contacts the body has
been brought into play. Attention has also shifted to the beam itself. The
realization soon comes that there is another way of changing the location of the
beam. This change involves not a successive change in location, but a
simultaneous location change through the use of multiple beams at the same time.
The idea of lowering the intensity of the individual beams in order to meet the
requirement of effectively treating the tumor while protecting the surrounding
tissue follows fast on the heels of the multiple beam notion. The problem has
been solved, but more importantly a unique opportunity to critically examine a
"piece" of thinking has been provided.
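
The arithmetic behind the multiple beam solution can be sketched in a few
lines. This is only an illustration and was not part of the classroom
exercise: the function names and the dose figures are hypothetical, and the
sketch assumes that the effects of separate beams simply add where the beams
cross.

```python
# Illustrative sketch only. All numbers are hypothetical units, not
# clinical values, and beam effects are assumed to add where beams cross.

def dose_along_path(per_beam_dose):
    """Healthy tissue on any one path is crossed by a single beam only."""
    return per_beam_dose


def dose_at_tumor(num_beams, per_beam_dose):
    """All beams converge on the tumor, so their doses are summed there."""
    return num_beams * per_beam_dose


def is_solution(num_beams, per_beam_dose, tumor_threshold, tissue_tolerance):
    """True when healthy tissue is spared and the tumor is still destroyed."""
    spares_tissue = dose_along_path(per_beam_dose) <= tissue_tolerance
    destroys_tumor = dose_at_tumor(num_beams, per_beam_dose) >= tumor_threshold
    return spares_tissue and destroys_tumor


TUMOR_THRESHOLD = 100   # hypothetical dose needed to destroy the tumor
TISSUE_TOLERANCE = 30   # hypothetical dose healthy tissue can withstand

# One strong beam: destroys the tumor but also the tissue in its path.
print(is_solution(1, 100, TUMOR_THRESHOLD, TISSUE_TOLERANCE))
# One weak beam: spares the tissue but does not affect the tumor.
print(is_solution(1, 25, TUMOR_THRESHOLD, TISSUE_TOLERANCE))
# Four weak beams converging on the tumor: both requirements are met.
print(is_solution(4, 25, TUMOR_THRESHOLD, TISSUE_TOLERANCE))
```

Under these assumptions the first two cases fail and only the third succeeds,
which mirrors the path the students followed: neither strengthening nor
weakening a single beam works, while dividing the dose among several
converging beams does.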

TAKING ANOTHER'S POINT OF VIEW

The act of seeing things from another person's point of view is a central
theme of the whole course. In almost every class meeting there is an
opportunity to do this and to reflect on the results. But we wanted also to do
some work more directly aimed at becoming aware of the process of perspective
taking. In the fall of 1977 Camille Burns and Howard Gruber planned a
three-unit sequence with this end in view. The plan was as follows:

a. Understanding poems in which the meaning turns on a sudden shift in
perspective. We planned to have the students read first a very simple poem
and then a more complex one. After they had understood each, the next task
would be to discover what they had in common (i.e., a sudden shift in
perspective).

b. Struggling with moral dilemmas in which the question of what is
right depends on whose ox is gored. The moral dilemmas were brief
anecdotes of the kind invented by Piaget and by Kohlberg to study the
development of children's moral judgment.

c. Writing a dialogue about a perplexing social issue in which the
student is required to shift perspectives as he or she first writes one
speaker's lines then the other's.

It should be stressed that we were distinctly not trying to inculcate a
1950s social-science "objectivity" or non-partisanship. On the contrary, when
the time came, we tried to bring out the idea that understanding other people is
important in order to struggle well for what you believe: to clarify your own
ideas, to discover your allies, to anticipate your opponents. But the first
step in all this is to understand the other.

Complex plans are risky in a teaching process predicated on inviting the
students to think. What if things don't go as expected? Must the whole plan go
out the window? In this instance, that was almost what happened.

The first poem we used was "Quatrain" by Sarah N. Cleghorn:

    The golf links lie so near the mill
    That almost every day
    The laboring children can look out
    And see the men at play.

We expected it to be very easy to understand. Half an hour at most of
Quatrain and we could get on to Shelley's Ozymandias. That should leave plenty
of time to finish the period by collectively discovering what the two poems had
in common.

To my surprise, "Quatrain" turned out to be difficult.
The class was unfamiliar with the term "golf links", which was easily
translated as "course". They didn't quite grasp that golf was a rich man's
game (maybe it isn't so much any more). They didn't know about the history of
the struggle for laws prohibiting child labor. And they probably did not have a
clear idea that the function of poetry might be to voice social protest. These
points emerged as we went around the room and each student gave his or her
interpretation of the poem. As each student spoke, betraying a wealth of
unlooked-for misunderstandings, my dismay grew. I was dejected not so much at
the plan going wrong, but at the low level of culture I perceived in the group.
And I was on the verge of committing what would be the cardinal sin within this
method of teaching: simply lapsing into telling the class the "right answer".

But I persevered. I provided some cultural and historical background; the
class categorized their different interpretations, and then discussed them. My
own interpretation was eventually included in the list, but I did everything I
could to avoid suggesting that a poem has only one right meaning. The period
ended on a note of irresolution.

The next period, I was still tempted to go back to Quatrain and insist on
the right answer. I resisted the urge (we had had enough of those four lines
for a while) and we went on to Shelley's Ozymandias. To me, this poem seems
more difficult than the Quatrain: longer, more complex, more exotic. But on
the whole, the class understood it quite well: that is, their understanding
matched mine fairly closely. I had learned my lesson and we took our time.
Listening to the nuances of their differing reactions I heard things I had never
noticed in the poem, although I have known and re-read it over a span of 40
years. The students moved me by their insight, and my spirits lifted. But we
ran out of time and still had not gotten to the question originally planned:
what do the poems have in common? (Right answer: a sudden shift in point of
view.) I asked them to write out a paragraph for the next class, dealing with
this question (they had copies of the poems).

At the third period in the sequence, we went round the room again. Yes,
some of them got the "right answer". But far more important, some of the
students discovered something else the poems have in common: both deal with
power. So by the time we were done with this, the students had studied two
poems, perhaps more carefully than ever before in their lives. They had seen
the many interpretations possible, both widely ranging and sounding fine
nuances, making the ideas of the person next to you worth hearing. I had
learned a lot from the group, and my interest in their ideas was important to
them. And, albeit a bit slowly, we had come out of it in a reasonably good
position to go on with the original plan. The next steps went very well. The
second task, moral dilemmas, was marvellous grist for the mill of our circular
process. The third task, dialogue-writing, was difficult but not overwhelmingly
so. The class was mostly Black and Hispanic. I had chosen as the material for
the dialogue a letter that had appeared in the student newspaper, evidently by a
white person, arguing against affirmative action programs in which minority
group members are given preferential treatment in employment practices.
Everyone found it easy to answer the author. But the task we had set was to
answer the author, then to give the author's reply, and finally to have the last
word. Some of the students were reluctant or found it difficult to frame a real
argument for the opponent's side, and resorted to having the opponent say
merely, "You're wrong." In comparing the widely varying productions of the
students, the weakness of this stratagem became evident without my pointing it
out. There was a difficulty there, but the class overcame it.

That year, the students' interpretations of the poems were given orally and
I have no exact record of what they said. In 1979 Bob Keegan and I used Sarah
Cleghorn's Quatrain as part of a somewhat different exercise which began with
having the students write out their ideas of what the poem means.

This group came a good deal closer to agreeing on the interpretation of the
poem as a protest against child labor, possibly because of the way we set the
stage for the exercise. Nevertheless, the interpretations cover quite a range.

STUDENT INTERPRETATIONS OF SARAH CLEGHORN'S QUATRAIN

M.W.

Children are not the only ones who play, or fool around, but grown-ups also
have the need to enjoy and play at some point or other.

N.H.

The poem suggests that children are working but not so far off they can see
men playing. This could mean that young people are going through certain cycles
so that they can be where these men are. For instance going to school to get an
education will soon earn young people jobs that are being held by adults at the
present time.

J.W.

Children are hard at work, while grown men can find the time to play golf.
This is as if the men want to rub the children's faces in their poverty.

R.L.

It seems as if the golf course is so near the mills, or working area, that
the children who are working can see the men playing golf.

V.L.

The point of view is reversed. The men should be working (laboring) rather
than the children; children should be playing rather than laboring and
watching the men play on the golf links (golf course).

(Summarizes poem about same as R.L.) I think it would be better if the men
were working and the kids were playing. Or at least the mill shouldn't be so
near to a recreation center, because it would make the kids feel sad.

J.B.

From children's point of view, the poem seems to suggest that looking out
to see the men at play is something that is taken for granted. The children can
see, their view which includes the golfers' view too. View not absolute as it
encompasses two views too. It is a relative view.

R.S.

During their chores the children observe the older men playing golf. Makes
me think of inequality and bondage.






L.M.

(edited slightly) I gather from the word mill that they mean something to
do with wheat or grains, because that's what goes on at a mill. It doesn't take
much of anything to work at a plain old mill. From laboring children I get the
impression that the author is talking about slaves. The men at play are rich
men (white), play golf watching the poor children (slaves - black) working. I
mentioned earlier that it doesn't take much of anything to work in a mill,
meaning the stereotype that blacks only had muscle and no brain so only labor
jobs would be issued to blacks (dummies). And realistically the game of golf is
played mostly by white people.



B.B.

There are wealthy men playing golf on a golf course. Near the course was a
mill where poor families sent their children to work at the mill to help support
the family. And everyday the children look out at these men and wish they could
play and not work.

M.D. 

Things are reversed. Instead of the men working and children at play, the
children are working in the mill and the men are out playing on the golf course.

P.R.

I think that this poem is saying there is a certain irony between the
children laboring and the men playing. Usually you would expect this to happen
in reverse. This also suggests that the children are poor and the men are rich.

G EL 

This poem was written many years ago. Children didn't go to school because
they had to help support their families and themselves. But the men who were 
well off could have leisure time to do things such as play golf. 

B.S.

All the subjects were adults, but were categorized according to their
wealth... Children as opposed to men symbolize the superiority of the playing
group.

K.S.

...a group of children who are busy at work... can see men playing golf,
near where they are laboring.

D.R.

It means that while the poor children are working every day, they probably
wish they were playing. Instead, they just watch the men relaxing or playing,
while they are working.

Children in the days of the depression worked in sweatboxes. These
children work in a mill. Looking out of the factory windows these children are
working at normally men's jobs. The men are playing a childish game, in this
case golf.

M.F.

Children are hard at work, while men are busy at play. Should be the other
way around.

P.C.

...men playing golf on the golf links because of the labor of children.



EMOTION, COGNITION AND REALITY

Our basic strategy is to single out some aspect of cognitive functioning,
develop a task situation that calls upon that aspect or sub-skill, and draw the
students' attention to that domain. But any real performance, of course, draws
on varied kinds of knowledge and skill. The focus of the course depends not
only on the list of tasks proposed and on the unpredictable interplay among all
of these, but also on the teacher's emphasis in steering the class one way or
another.

While our main emphasis is on intellectual functioning itself, we are aware
of the vital relation between cognition and emotion. This relation becomes
paramount as one tries to think as well as possible in real, human situations.
In different ways, some of the exercises we use are aimed at increasing
awareness and control of this relationship. We give here only brief indications
of some of our efforts in this direction.

ANGER. We ask the students, "try to remember some incident in which you
were angry at a teacher." Sitting quietly, each student writes out notes on his
or her recollections. Usually, no memories come at first. After a few minutes
they begin to pour out. Then we go around the table with each student
reporting. This turns into a very lively discussion and could occupy many weeks
if we let it. As the session goes along we steer attention toward how the
students had handled the situation in which they found themselves, and
eventually to reflection on the availability of alternative strategies for
coping.

BEWILDERMENT. This experience grew out of a planned activity that was
side-tracked by the spontaneous course of events. It might not be repeatable,
but the general idea is interesting. One semester, we wanted to draw the
students' attention to how they listen to a lecturer, take notes, and use those
notes. We had a plan for this sequence which we never completed. The first
step was for the students simply to observe themselves in any other class and
to come to our class prepared to describe how they listen. When we had the
round-robin, it became clear that they all felt bewildered, overwhelmed,
baffled, and finally bored by most of their lectures. They felt the teachers
were "snowing" them, and not paying attention to the students' needs.

We offered them a choice. Either we could try to work out ways to listen
as well as possible in such situations - in our view, not an entirely
unrealistic plan, since so much of life is like that. Or we could work out ways
to try to change the situation. The students chose the latter path. Together
we worked out a simple plan - nothing more than raising the issue with the
teacher in question, either before, during, or after a class period. Each of
our students took on the responsibility of trying to change a class!

This was one of the few times that most of the students in our course
failed to do their homework. A few did do it, and everyone's reflections on the
difficulties experienced in doing or not doing it were of great interest.






INTRODUCTION TO PART IV

In this section we turn our attention to a reasonably concrete illustration
of assessment that departs from present practice and is consonant with the views
expressed earlier in this volume.

It is by now clear that the members of our study panel hold the view that
the distinction between assessment and instruction is largely artificial and
arbitrary. Hilliard (1980, personal communication), for example, in discussing
this point says:

    "A testing and assessment system can be built without direct
    reference to learners. When this happens, the "correct" logic and
    content of answers to questions are assumed to be known in advance by
    the questioner. The goal of testing in this case is to determine if
    learners agree with questioners. A testing and assessment system may
    also be built to use the learner's repertoire for building questions.
    This has sometimes been referred to as response-contingent testing.
    THE KEY POINT TO BE MADE HERE, HOWEVER, IS NOT A POINT ABOUT TESTING
    PER SE. IT IS REALLY A POINT ABOUT TEACHING. Any type of testing
    which is selected will fit a particular philosophy of and approach to
    teaching. Paulo Freire has described two different approaches to
    teaching. The "banking" approach is generally manipulative. Students
    are said to "learn" when their answers to questions match those with
    which the teacher begins. An alternative approach is called a
    "dialogical" approach. Students are said to "learn" under this
    approach when they become problem-posing activists. Both questions
    and answers are new to both teachers and students. In the first
    approach, the teacher's role is to "donate" the material which the
    student is to learn. In the second approach, the teacher's role is to
    establish a true dialog between teacher and student.

    These are not mere theoretical matters. Paulo Freire is astoundingly
    successful in using dialog to teach literacy and problem solving.
    William Johntz and teachers who are trained by him are equally
    successful in teaching low income children, from any cultural group,
    relatively abstract mathematical concepts and skills where others had
    failed to teach arithmetic before. In both cases, "testing" or
    assessment is ongoing. The teachers and students use the students'
    repertoire as the building blocks for learning...

    Here are some examples of ongoing assessment... Paulo Freire
    places great stock in listening to his students. He listens to detect
    those things about which they have strong feelings. He listens to
    record the vocabulary which his students already know. These two
    parts of his systematic assessment process are then used directly in
    instruction. Students learn to read (in about 30 hours of
    instructional time) by using their own words and by focusing on issues
    of importance to them. William Johntz and Project SEED teachers place
    great stock in listening for student logic and for student
    assumptions. They also listen for full participation of all students.
    They observe exactly where each student agrees or disagrees with each
    step of the group's problem solving effort. They observe if students
    are willing to argue for positions which they hold, even if alone
    against the entire class. These and other data are collected
    systematically in order to design the "in flight corrections" of the
    teaching strategy. William Johntz and Project SEED also direct a part
    of their ongoing systematic assessment process
    toward the teacher. Peer critique is always used, and it is done [...]

Assessment in the service of instruction requires that neither [...]
be secret. Secrecy is anathema to education. If assessment is to indeed serve
instruction, then ways must be found for students, teachers, and parents to
derive insight and information from the assessment practices that are in fact
employed.

In the context of assessment in the service of instruction, [...]
in favor of open assessment [...] (Times
Educational Supplement, London, Nov. 16, 197[?]) who says:

    [...] all have important stakes in the results of the tests our society
    administers. Some of us are [...] students, parents,
    teachers, school administrators. Other
    interested parties [...] more or less political power: public
    administrators, legislators, academics, [...]
    interested citizens and taxpayers.

    Secrecy of tests erects unnecessary walls that hinder the many-sided
    interchange among these interested parties. Secrecy aggravates
    inequalities that already exist [...]
    and student, or between [...]

    In our society test results are taken to be indicators of success and
    worth for individuals and [...] That is what makes the
    secrecy of these tests so uniquely perverse and damaging... Since that
    secrecy is also unnecessary, its elimination should have a high
    priority in public discussion and public policy.

    Freedom of information acts and other legislative remedies are steps
    forward, although few parents, for example, have the knowledge,
    determination, or resources to invoke such laws. It may be that test
    secrecy will finally be eliminated only after major court cases result
    in substantial damages being paid to [...]
    their injuries in silence.

In addition to openness, it is clear that new approaches to assessment in
the service of instruction demand a re-examination of the notion of "validity"
as applied to the design and use of assessment instruments. In the paper by
Schwartz, Taylor and Willie, the work of Project TORQUE on this important
methodological question is presented.






Project TORQUE was a research and development effort supported by the
Carnegie Corporation and the Ford Foundation that was charged with the
responsibility of designing alternative assessment techniques and instruments
for elementary school mathematics. In the course of this six year project,
a different approach to validation was evolved; one that was not
even statistical, but rather categorical in nature. Such an approach to
validation seems to have produced techniques and instruments largely free of the
flaws of more traditional approaches.


The paper describing Project TORQUE that is presented in this section
clearly does not constitute a [...] who is concerned with the
[...]. We hope, however, that it will provoke careful consideration of what we
believe to be some important principles to be considered in any such effort and
[...] in its work. They are:

     In an instructional context, distinctions between assessment and
     instruction are arbitrary and artificial.

     [...] inferences about a person's knowledge can only be made within
     [...] framework of [...] the task, and the
     [...] students [...]

     [...] self-examination and seeking out
     of critical appraisal are all evidence of successful educational
     [...]

     [...] of an individual or of the species, is
     impeded by secrecy.






SCHWARTZ, TAYLOR & WILLIE



Project TORQUE 
An Example of Categorical Test Validation 



Judah L. Schwartz
Edwin F. Taylor 
Nancy A. Willie 



I. INTRODUCTION

People who make decisions about other people's lives have social and
political power. Insofar as testing is used to influence these decisions, tests
are instruments of power. The pervasive use of tests in the United States has
bred much criticism (Houts, 1977; Hoffman, 1967). This criticism has had some
results: advocacy groups, educational reform movements, legislation, and
regulation, all of which seek, by one means or another, to protect the rights of
individuals from "the tyranny of testing."

In the short term, scrupulously responsible use of currently available
tests may help meet such criticism, but a long-term solution requires more
fundamental reform of test development and use, a reform whose seeds may now
find fertile soil.

The work of Project TORQUE* described in this paper was motivated and
guided by our concerns about the role of testing within the larger societal
contexts in which it occurs. A pluralistic and democratic society requires
tests that are subjected to the scrutiny of many "experts" and the public at
large, for whom testing has social, political, economic, educational, and
ethical consequences. We write for those who share our interest in education
and society, and not only for professionals in the fields of education and
testing.

* A research group at the Education Development Center, Newton, MA 02160,
supported by the Carnegie Corporation of New York and The Ford Foundation.

This paper outlines the foundations and traces the consequences of several
assertions:

     (1) Testing cannot be separated from an understanding of the
     task being tested. Test-making is, in large part, the act of seeking
     understanding of the domain being tested and of the ways people
     demonstrate their learning of that domain.

     (2) Some learning domains can be analyzed into tasks and
     subtasks on which performance can be observed and categorized as
     "all-or-none."

     (3) "All-or-none" subtasks, when they arise empirically rather
     than arbitrarily, are useful in describing performance (testing) and
     helpful in improving performance (instruction).

We apply our theory of performance categorization to the domain of
mathematics learning, specifically to the tasks of making measurements of
length, area, [...]
several tests of performance on these tasks. (N.B. We reserve the word
MEASUREMENT for the application of numerical scales to physical quantities. In
evaluating human performance, we try to use categories rather than numbers.)

The following sections outline our theory of performance
categories, discuss the implications of that theory for test development and
validation, provide an account of the process we designed for test development
and validation, and consider the generalizability of our work.

A characteristic common to most testing practice is the reporting of a
test result as a number on a scale or a score. We believe that this application
of numbers to an individual's skills and performance is [...];
the use of numbers in this way confounds and misdirects educational endeavors
and the development and use of tests. We outline briefly the arguments in order
to set a theoretical stage for the categorization of performance described in
the following sections of this paper.

In the natural sciences numbers are used to describe two kinds of
quantities. Discrete quantities, such as the number of apples in a basket or
people in a room, are countable. Continuous quantities, such as the distance
from Boston to San Francisco, are measurable. The following acts are necessary
elements of any such measurement:

     Identifying the attribute of the object to be measured, and
     distinguishing it from other attributes the object may possess;

     Choosing a unit of appropriate size;

     Comparing the attribute to be measured with the unit;

     Judging a level of precision appropriate to the context of the
     measurement.

We do not consider any situation in which the attribute is defined only in
terms of the instrument used to "measure" it as being an example of measurement.
Thus, Boring's "IQ is what IQ tests measure" is, in our view, at best
tautologous. The attribute to be quantified must have some independent
definition.

Assume for the moment that it is possible to identify a distinguishable
attribute possessed by an individual and that one wishes to measure it. Is it
possible to define a "scale" that can be applied to the attribute? The use of
numbers to describe quantity rests on the assumed existence and appropriateness
of such scales.

Traditionally, scholars have referred to nominal, ordinal, interval and
ratio scales as being suitable for the measurement of psychometric variables.
We believe that only the ratio scale is a scale that permits measurement.
Neither nominal nor ordinal scales have anything to do with measurement, except
in a loose metaphorical fashion. Nominal scales simply assign numerals to
objects on the basis of whether or not the object possesses a particular
attribute. For example, a nominal scale could assign the number 7 to all
objects that are pink and the number 10 to all objects that are green. Ordinal
scales simply rank-order objects according to the amount of an attribute which
they assess. For example, glass can scratch steel, and steel can scratch wood.
Thus glass, steel and wood might be assigned the numbers 1, 2 and 3
respectively, because they can be ordered by "hardness". The interval scale
concerns itself with the intervals between the extent to which objects possess
an attribute. Standard intervals, called measurement units, are defined in
terms of a standard of comparison. A common example of an interval scale is
the Centigrade scale for measuring temperature, in which the difference between
10 degrees and 20 degrees is equal, in the attribute being measured, to the
difference between 20 degrees and 30 degrees. The
arbitrary zero point of the Centigrade scale should not be confused with the
fact that there does exist an absolute zero of the temperature scale, i.e., 0
degrees Kelvin. The existence of a non-arbitrary zero point, which implies the
ability to distinguish in a categorical fashion the presence of the attribute
from its absence, is, in our view, central to the identification of the
attribute. Only a ratio scale has this characteristic.
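
The role of the zero point can be illustrated with a short sketch. It is offered only as an illustration: the function name and the two sample temperatures are assumptions made for the example and are not drawn from the Project TORQUE materials. It shows that differences between Centigrade readings survive a change of zero point, while ratios between them do not.

```python
# Illustrative sketch only: the function name and sample values are
# assumptions made for this example, not part of the Project TORQUE work.

KELVIN_OFFSET = 273.15  # offset between the Centigrade zero and absolute zero


def centigrade_to_kelvin(degrees_c: float) -> float:
    """Shift a Centigrade reading so that zero means absolute zero."""
    return degrees_c + KELVIN_OFFSET


cool_c, warm_c = 10.0, 20.0
cool_k, warm_k = centigrade_to_kelvin(cool_c), centigrade_to_kelvin(warm_c)

# Differences survive the change of zero point (the interval property).
difference_c = warm_c - cool_c
difference_k = warm_k - cool_k

# Ratios do not: "twice as hot" on the Centigrade scale is an artifact
# of where its arbitrary zero happens to lie.
ratio_c = warm_c / cool_c
ratio_k = warm_k / cool_k

print(f"difference: {difference_c:.1f} C, {difference_k:.1f} K")
print(f"ratio: {ratio_c:.3f} (Centigrade), {ratio_k:.3f} (Kelvin)")
```

Only the ratio computed from the Kelvin readings refers to the attribute itself rather than to the choice of zero.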

A ratio scale has the following properties. First, there is a non-arbitrary
zero point. Second, the ratio scale can only be applied to one
dimensional attributes. One cannot order unambiguously points in spaces of more
dimensions. Third, one must be able to quantitatively define what is meant by the
interval, a "little bit more", of the attribute. Units, such as "one degree
hotter", or "one centimeter longer", or "one second later" must exist. Ordering
is insufficient; scaled comparison is necessary. Without such scaled
comparison, measurement can have no consistent numerical outcome.

The concept of "a little bit more" cannot be quantified and applied to
individual human performance, even in cases when highly refined and specific
subskills are identified as the attribute. For example, we identified the
"subtask" of using the zero point on a ruler correctly when measuring the lengths
of lines. For this subtask, performance can be described by "performance
fractions": (the number of correct uses of the zero point)/(the number of
opportunities). Performance can be ordered: 4/10, 5/10, 9/10 and 10/10. One
must be able to say how the interval, say, from 9/10 to 10/10 compares in size to
the interval from 4/10 to 5/10. The intervals themselves must be capable of
being ordered if there is a true ratio scale. Is the student who gets 5/10
correct superior in the subskill to the student who gets 4/10 correct by an
"equal amount of superiority" as the student who gets 10/10 correct is to the
student who gets 9/10 correct? Degrees of superiority of human performance have
no unique meaning. Without this unique meaning, all scaled performance, whether
in comparison with other test-takers, or in comparison with a perfect
performance, is not appropriately described by a ratio scale. And, thus, it is
not capable of being measured.
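
The arithmetic behind this argument can be sketched briefly. The helper name below is hypothetical and the four scores are those used in the example above; the sketch shows only that the fractions can be ranked and that the two gaps are numerically equal, which is precisely the equality the argument says carries no unique meaning about skill.

```python
from fractions import Fraction


def performance_fraction(correct: int, opportunities: int) -> Fraction:
    """Correct uses of a subtask divided by opportunities to use it."""
    if opportunities <= 0:
        raise ValueError("at least one opportunity is required")
    if not 0 <= correct <= opportunities:
        raise ValueError("correct must lie between 0 and opportunities")
    return Fraction(correct, opportunities)


scores = [performance_fraction(n, 10) for n in (9, 4, 10, 5)]

# Ordering is well defined: the fractions can always be ranked.
ranked = sorted(scores)

# The arithmetic gaps are equal, but the arithmetic alone cannot say whether
# the two gaps represent equal amounts of superiority in the skill.
low_gap = performance_fraction(5, 10) - performance_fraction(4, 10)
high_gap = performance_fraction(10, 10) - performance_fraction(9, 10)

print("ranked:", [str(s) for s in ranked])
print("gaps numerically equal:", low_gap == high_gap)
```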

II. CATEGORIZATION OF PERFORMANCE

In our empirical investigations of the tasks of measuring extensive
physical magnitudes, we have found subtasks on which people's performance is
consistently either present or absent and which permit us to replace metric
measures of performance with categorization. This section describes how we
analyze tasks into such "all-or-nothing" subtasks and what happens when we
cannot do so.

Our model of measurement, which derives from the physical sciences,
identifies the following major steps in making measurements:

     (1) Identifying the attribute of the object to be quantified,

     (2) Choosing a unit of appropriate size,

     (3) Carrying out the comparison of the object to the chosen unit,

     (4) Judging a level of precision appropriate to the context in which
     the measurement is made,

     (5) Reporting the results.



We identified subtasks for measuring length, time, area, volume, and weight
during an iterated process of theory formation, task analysis, and empirical
trials with students and teachers in elementary schools. We used the five-step
model of physical measurement to inform an initially rather unfocussed
exploration of a given task, such as length measurement, with students and their
teachers until we began to notice parts of the task on which students performed
either well or not at all. These "parts" or subtasks were progressively refined
and gradually embodied in games and activities and some test "items" that
allowed us to begin collecting ordered performance data. The measurement model was
continually invoked and refined to help us decide whether or not our set of
subtasks was relevant and comprehensive.

A sufficient task analysis would yield ordered performance data for each
subtask. The data would cluster in "consistently present" and "consistently
absent" categories, with few in the inconsistent category. We took the existence
of this dichotomous categorization to be evidence of distinguishability of the
given subtask.

The observed dichotomous categorization allowed us to dispense with the
scoring of performance: everyone's (or almost everyone's) performance could be
categorized within the present or absent category. Thus ordered performance
collapsed into two-valued categorization.






In summary, the process of arriving at a categorized description of human 
performance included developing an understanding and model of the task, 
increasingly focussed activities with students and their teachers, verification 
of dichotomous performance on subtasks, and categorization of this dichotomous
performance; the entire process repeated cyclically until successful. 

Or unsuccessful. For some skills we were unable to identify subtasks that
gave rise to "all-or-nothing" performance. In particular, the task of computing
elapsed time intervals frustrated our attempts at analysis and categorization.
It may be that we have not been sufficiently insightful or persistent. Or it
may be that for some tasks the subtasks are so interrelated that performance on
one subtask influences performance on another. Or finally, of course, this
result may indicate a failure or region of inapplicability of our method.

One final remark is in order before closing this section and moving on. It
is not possible to completely separate or unconfound the effects of the observer
and the phenomenon being observed. We know this to be true in the physical
sciences, where the assumptions that experiments may be repeated and that the
nature of the interaction between the observer and the system being observed is
known and is small are plausible. In the course of observing human intellectual
behavior it seems to us that these assumptions are rather more questionable.






£rem thari, 

III. TEST DEVELOPMENT AND USE

[...] the meaning of "validity."

TESTS VS. "REALITY," OR WHAT IS SOMETIMES CALLED "CONTENT VALIDITY."

[...] tests as close observation, for some purpose, of an individual's
performance in natural settings.

When tests are constructed to represent "reality," the adequacy of the
[...] Because there is no way to be sure that validating tools are themselves
valid, [...]. However, we believe that when tests are developed in the settings
in which they will be used, when such development is the result of extensive
observation of "real" performance as interpreted using a model of that
performance, and when tests closely resemble performance on alternate [...],
[...] feel some confidence that the test examines the skills one wishes to
observe [...] reasonably high.

CONSTRUCTED VS. SELECTED RESPONSES, OR WHAT MAY BE CALLED "RESPONSE VALIDITY."

Just as one must be concerned about the adequacy of a test as a
representation of "real" tasks, one must also feel confident that the ways in
which people respond to test questions represent the ways that they undertake the
"real" task. Our concern with the responses of students has led us to reject
the selected response ("multiple choice") format for several reasons. First,
constructed responses simulate real performance more realistically than selected
responses. People do not ordinarily choose among possible answers when measuring
length or time. Second, constructed responses allow students latitude in the
ways they can perform tasks, a latitude especially important in a pluralistic
society. Third, a constructed response can signal the presence or absence of
performance on several subtasks, which often can be separated using evidence
from the detailed response. Fourth, constructed responses permit a diversity of
[...] teachers can refine their understanding of performance and
non-performance in order to make instructional decisions. In addition,
constructed responses provide us with a stringent check of our understanding of
the task being tested and of the presence of task-analytic categories. The
likelihood of students performing in predictable ways is greatly reduced when
constructed answers are permitted. When our model of the task does account for
the diversity of constructed answers, we can be more satisfied with our
understanding of complex behavior.

THE USERS, OR WHAT MAY BE CALLED "PRACTICAL VALIDITY."

People use tests for many reasons. The tests described in this paper were
designed to permit teachers (and others with similar concerns) to describe
individuals' performance and errors in one domain, measuring. Students may
learn to make successful measurements as a result of experiential learning that
does not compartmentalize tasks and subtasks, but the teachers' role as
"trouble-shooter" in this learning process requires that they have some
analytical approach to the performance of their students. Teachers need to
identify those students who show "mastery" of the skills of measuring, and, of
equal importance, they need to be able to characterize the needs of those
students who have not demonstrated "mastery."

"All-or-none" performance makes some instructional decisions relatively
easy. The measurement tests developed by Project TORQUE each probe the
student's skill in only one kind of measurement. On each test, regardless of
the number of "items" (between 6 and 12), the general criterion for mastery is
one or no errors. A student who meets this general criterion has made, at most,
one error on one subtask. For those students who do not meet this criterion,
another look is necessary. This second look, and the consequent categorization
of the errors, is a rich source of useful diagnostic information. In some
cases, the error analysis may reveal that the absence of performance on only one
subtask is the source of several errors. In other cases, the error analysis may
reveal that all the measuring subtasks have been mastered, but that errors have
been made in related tasks such as counting or calculating. In still other
cases, this second look may prove insufficient, and a third look, with an
alternate version of the test or with games and activities like those used
during test validation, may be necessary before a teacher can decide on a
student's learning needs. (To facilitate this process for teachers, we provide
them with information we have gathered during our research and development work.
A clear description of the subtasks and common errors of measuring is written
in a teachers' manual. A list of the categories of common errors, and a partial
list of commonly occurring wrong answers which signal those errors, is printed
directly onto a teacher's carbon copy of each student's test.)
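
The first and second "looks" described above can be sketched as a small routine. This is a minimal illustration under stated assumptions, not the project's scoring procedure: the function names, the data layout (one entry per item, `None` for a correct answer) and the error label "zero point" are all hypothetical.

```python
from collections import Counter

MAX_ERRORS_FOR_MASTERY = 1  # "one or no errors", as stated in the text


def first_look(item_errors):
    """Return True if the test meets the general criterion for mastery.

    item_errors holds one entry per test item: None if the item was
    answered correctly, otherwise a label for the category of error.
    """
    wrong = [e for e in item_errors if e is not None]
    return len(wrong) <= MAX_ERRORS_FOR_MASTERY


def second_look(item_errors):
    """Count errors by category so a single weak subtask stands out."""
    return Counter(e for e in item_errors if e is not None)


# Hypothetical six-item test: three errors, all traced to one subtask.
student = [None, "zero point", None, "zero point", "zero point", None]

if first_look(student):
    print("meets the general criterion for mastery")
else:
    for category, count in second_look(student).most_common():
        print(f"{count} error(s) in category: {category}")
```

In this invented case the second look attributes all three errors to a single category, which corresponds to the situation in which the absence of performance on one subtask is the source of several errors.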

The descriptive power of tests which are based on "all-or-none" categories
may have significant instructional results. In preliminary field trials of the
TORQUE measurement tests, teachers have been able to identify specific learning
needs and to focus instruction on them because they have been able to observe
their students' performance and to interpret that performance in terms of a
theoretically derived and empirically verified model of the task.






IV. DETAILS OF THE DEVELOPMENT PROCESS



This section gives a detailed account of our process of test development,
using as an example one test of the measurement of area.



After observing students, reviewing current curricula, and [...]
extensive discussions in our staff and with classroom teachers, we designed
games and activities that permitted us to observe performance on the tasks of
area measurement according to our general model of measurement. Students then
used these games and activities in their classrooms. Although teachers and
students were enthusiastic about the games, teacher observations did not
identify subtasks on which dichotomous performance could be observed. Staff
members then worked intensively with small groups of children, using a variety
of trial materials, until we had focussed progressively on subtasks on which
performance seemed to meet our criterion of dichotomy.

We found that for area the identification of the attribute, the first step
in our model of making measurements, was a difficult task for many students, and
that we could describe subtasks related to this step. As a result, we decided
to design two tests of area measure: the first test focusses on attribute
identification by asking students to measure areas by "covering" regions with a
nonstandard "tile" unit; the second test deals with the measurement of area by
computation using measured lengths. In this section, we trace the development
of the first of these tests.

Our accumulated experience with teachers and students revealed two major
subtasks of identifying the attribute of area:

     (1) DISTINGUISHING BETWEEN LENGTH AND AREA. The most common error
     students make is to use area units, in our case "tiles," as units of
     length rather than area. (We found this to be true even when area
     units were triangles or hexagons. The longest length of the area unit
     was used as a length unit.) When presented with rectangular and
     irregular regions, some students measure one length, others add two
     perpendicular distances, others measure the distance around the edge
     of the shape, the perimeter.

     (2) DISTINGUISHING BETWEEN SHAPE AND AREA. Many students do not
     distinguish between area and the shape in which it occurs. When we
     presented regions that could be "covered" only by using half-tiles as
     well as whole-tiles, some students ignored those parts of the region
     which could not be covered by whole-tiles, other students counted
     every half-tile as a whole unit, and still others used overlapping
     tiles, each counted separately, to avoid partial units altogether.
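
The length-confusion errors listed under (1) lead to predictable wrong answers for a simple rectangle, which can be sketched as follows. The sketch is an illustration only: the function names are hypothetical, square tiles with whole-number side counts are assumed, and the strategies are limited to those named in the text.

```python
def length_confusion_answers(width: int, height: int) -> dict:
    """Answers a student might give for a width-by-height rectangle of
    square tiles, keyed by a description of the strategy used."""
    return {
        "area (correct)": width * height,
        "one length (longer side)": max(width, height),
        "one length (shorter side)": min(width, height),
        "sum of two perpendicular lengths": width + height,
        "perimeter": 2 * (width + height),
    }


def diagnose(answer: int, width: int, height: int) -> list:
    """List every strategy that would produce the given answer."""
    table = length_confusion_answers(width, height)
    return [name for name, value in table.items() if value == answer]


for response in (12, 7, 14, 9):
    matches = diagnose(response, width=3, height=4)
    print(response, "->", matches or ["no listed strategy"])
```

An answer can match more than one strategy (for a 2-by-2 region the correct area and the sum of two sides are both 4), so a table of this kind supports observation of the student at work rather than replacing it.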

Performance on these two subtasks could be observed as students "covered" a
variety of regions, some of which could be covered by whole-tiles and some of
which required both whole-tiles and half-tiles. Length confusion was observable
in both cases, while shape confusion was observable only in the latter. We
labelled the subtasks "whole units" and "half units," for convenience. There
are peripheral tasks, such as counting, adding, and familiarity with fractions,
which we knew would affect performance, but we believed we could separate
performance on these peripheral tasks from performance on the attribute
subtasks.



[...] several preliminary versions of our [...] a wide variety
of validating games and activities [...].
We sought a set of materials that would demonstrate simultaneously (1) that our
analysis of the subtasks was sufficiently correct and [...]
performed either all-or-none on a given subtask, (2) that a significant fraction
[...]
that test performance on each subtask was consistent with performance on
validating activities.

One validating activity which evolved from this procedure is shown in
reduced form in Figure 1. It carried the English title, "Lots of Land," and the
corresponding Spanish name, "Ranchos Anchos." Students considered it a map of
plots of land which they could purchase by measuring the area of each plot in
tile units. Starting with any plot of land along a short side, the player would
move from one lot to an adjacent lot, measuring the area of each one, until a
connected path crossed the board. Players were asked to pass through a "free"
lot in the middle of the board in order to ensure that every player would
measure a sufficient number of each kind of area chosen according to the
subtasks listed above.

Figure 3 shows one form of the test that was validated against the "Ranchos
Anchos" activity. The apparent simplicity of this test is somewhat deceptive:
every graphic feature and characteristic of each item is the result of much
close observation and many discussions. Behind this particular version of the
test is a set of rules for generating each item in multiple versions.

The six items of the Area Measurement Test (tile units) have the following
characteristics:

     Items a and b are rectangles which can be covered by whole "tile"
     units and which contain interior cells.

     Items c and d are irregular, contain interior cells, and can be
     covered by whole "tile" units.

     Items g, h, i and j are irregular, contain no interior cells and must
     be covered in part by half-units. The regions in items (g) and (h)
     which must be covered with half-units are more easily partitioned than
     the half-unit regions in items (i) and (j). For items (g) and (h) the
     half-units can appear as tabs, with three sides exposed on the contour
     of the shape. For items (i) and (j) the half-units are embedded in
     the shape.

Two versions of the test were used in the validating procedure described
here.

The validation process itself took place in various school systems in which
we could visit classrooms with children from diverse ethnic, cultural,
linguistic, and socioeconomic backgrounds. A typical validation session went as
follows: two to four staff members would arrive in a classroom at midmorning
with a box of materials. Each staff member (rather than the teacher) would
select two children at random and sit down with them at a table to one side of
the on-going classroom activities. We would explain to the
children that we were making up tests and needed their help to discover whether
the tests were good enough. We told them that we would be taking notes on
observation sheets while they did the test and played some games, and that we
might ask them some questions as they went along in order to understand their
thinking. When the session was over, we would answer all their questions and
talk with them about what we and they had learned.

The children would first take one version of the test as pretest. During
this and the entire validation procedure the observer would watch the
measurements being made, take notes, and ask for explanations of strategies that
the children were using. After the test came the validation activity, in this
case a one-person game, although for tests of some other measurement skills the
games were group games. Following the game, each child took an alternate
version of the test as posttest. (Our terms "pretest" and "posttest" refer to
their position as first and last in the validation procedure. This use of the
terms is not to be confused with conventional uses in which explicit instruction
takes place between the two tests.)

After the formal validation procedure was completed, we welcomed the
children's comments and criticisms. We refrained from making judgments about
individual student performance, but we encouraged discussion about the questions
and the activities. When children showed specific interest, we spent some time
teaching them about the measurement skill that was the subject of the validation
activity. We showed the teachers copies of the tests and validation activities
but did not discuss with them the performance of individual children.

These validation sessions typically lasted about an hour each.

Now began the work of interpreting the observations. Each observation
sheet, along with the completed written tests and game sheets, carried as full
an account as we could manage of the behavior observed. We organized this
account under the subtask to which we wished to pay particular attention. The
pre-condition of validation was "all-or-nothing" performance by a large
majority of children on each subtask in both tests and validation activity. The
criterion of validation was the consistency of performance on tests with
performance on the validation activity. In the following sections we report on
the application of this precondition and criterion to a variety of tests.



V. DETAILS OF THE VALIDATION PROCESS

Our analysis of each student's performance on each subtask began with our
interpretation of each constructed response, interpretation made reliable by our
observations and notes. We made a decision of "correct" or "incorrect" for each
subtask included in each response. A "performance fraction" (number of correct
responses/number of opportunities to respond) was used to describe the subtask
performance for each student on pretests, validating activities, and posttests.
Performance fractions were plotted on linear scales, as shown in Figure 4. When
most performance fractions fell near the "top" or the "bottom" of the ordered
scale, we had satisfied our precondition that correctly identified subtasks give
rise to dichotomous performance. Table 1 shows, for each subtask on the series
of measurement tests we designed, and for both methods (tests and validating
activities), the percentage of students whose performance fractions on subtasks
fell within the "top" and "bottom" boundaries, thus satisfying the precondition
of dichotomous performance.
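
The kind of percentage reported for each subtask can be sketched as follows. This is a minimal illustration, not the computation used for Table 1: the function name, the sample data, the boundary values of 25% and 75%, and the choice to count a fraction lying exactly on a boundary as "top" or "bottom" are all assumptions made for the example.

```python
from fractions import Fraction


def share_dichotomous(fractions, low=Fraction(1, 4), high=Fraction(3, 4)):
    """Percentage of performance fractions at or beyond the boundaries.

    A fraction counts as dichotomous if it is <= low ("bottom")
    or >= high ("top"). The boundary values are assumptions.
    """
    if not fractions:
        raise ValueError("no performance fractions supplied")
    extreme = sum(1 for f in fractions if f <= low or f >= high)
    return 100 * extreme / len(fractions)


# Hypothetical data for one subtask: (correct, opportunities) per student.
raw = [(6, 6), (5, 6), (0, 6), (1, 6), (6, 6), (3, 6), (0, 6), (6, 6)]
subtask = [Fraction(c, n) for c, n in raw]

print(f"{share_dichotomous(subtask):.1f}% of students in top or bottom")
```

How large this percentage must be to count as a "large majority" is not specified in the text, so the sketch reports the figure without applying a threshold.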

Defining the boundaries of "top" and "bottom" performance is central to our
process of validation. By defining "top," "middle," and "bottom" regions on the
linear scale, we were able to categorize the performance we observed:
consistently correct, consistently incorrect, and inconsistent. How high is
"top" performance? [...] We examined this
question and determined that, in practice, the location of the boundaries does not
matter much; there are about as many [...].
[...] "top" as the region 75% to
100% performance, "bottom" as the region 0% to 25%, and "middle" as the wider
region 25% to 75%. Figure 5 shows these [...] the example data of
Figure 4.

[...] activities, the three performance fractions [...] will fall within the
same performance categories. We hoped to validate each subskill by
demonstrating that individual students did perform consistently across tests and
games. It is clear from both Figures 4 and 5 that the performance of child
#9007 on pretest, game, and posttest meets neither the precondition of
dichotomous performance nor the criterion of consistent performance that would
tend to validate the test for this subskill.

VALIDATION CATEGORIES

Stated in terms of our categories of performance described above, a
validation case consists of a triple of performances on pretest, validation
activity, and posttest, all three of which lie within a single region: "top" or
"middle" or "bottom." In this section, we examine possible results in which at
least one of the triplet of performances lies in a region different from the
other two. These results are not validating.

What triplet of performances will tend to invalidate the test for a
particular child and subtask? Generally there are two distinguishable classes
of performance profiles that we classify as invalidating. In the first one the
performance is in a higher region on both the pretest and the posttest than it
is on the validating activity. There are five such profiles, shown in Figure 6.
Assuming that the validating activity correctly represents "real measurement,"
the test would yield a false positive for these children on these subtasks.
Because these profiles have the general shape of a Roman vee, we call them
"Invalid Vee."

The other class of profiles which we consider invalidating are those in
which performance on the pretest and the posttest are both in a lower
performance region than on the validating activity. All five such profiles are
shown in Figure 7. Because these have the general shape of a capital Greek
lambda, we call these cases "Invalid Lambda."

In categorizing validation results, we have dealt so far with three
validating profiles (validating top, validating middle, and validating bottom)
and ten invalidating profiles (five invalidating vees and five invalidating
lambdas). In a world specifically constructed to make life easy for
test-makers, these would be the only categories that exist. Unfortunately, in
terms of our performance triples, there are fourteen other possible cases. These
fourteen cases divide naturally into two groups. There are some children whose
performance generally improves during the validation procedure. All such
profiles are shown in Figure 8. Alternatively, the performance of a few
children generally declines during the validation procedure. All such profiles
are shown in Figure 9.
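The counts given above (three validating, five "Invalid Vee," five "Invalid Lambda," and fourteen others) can be checked by enumerating all triplets. The following is a minimal sketch, not the authors' procedure: the function name `classify_profile`, the labels "improving" and "declining," and the rule that splits the fourteen remaining profiles by comparing posttest with pretest are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Rank of each performance region: bottom < middle < top.
RANK = {"B": 0, "M": 1, "T": 2}

def classify_profile(pre, activity, post):
    """Classify a (pretest, activity, posttest) triplet of regions."""
    p, a, q = RANK[pre], RANK[activity], RANK[post]
    if p == a == q:
        return "validating"
    if p > a and q > a:
        return "invalid vee"      # both tests higher than the activity
    if p < a and q < a:
        return "invalid lambda"   # both tests lower than the activity
    # Assumed rule: in every remaining profile pretest and posttest differ,
    # so the direction of change from pretest to posttest decides the group.
    return "improving" if q > p else "declining"

counts = Counter(classify_profile(*t) for t in product("BMT", repeat=3))
print(dict(counts))
```

Under these assumptions the enumeration yields seven improving and seven declining profiles, which agrees with the seven profiles shown in each of Figures 8 and 9.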

What significance do these fourteen profiles have for our decision about
the validity of a test? A common feature of all fourteen of these profiles is
that performance on the posttest is different from performance on the pretest.
This raises two primary possibilities: either the pretest and the posttest are
not equivalent or "something happened" during the validation process to change
performance. In a later section we examine the question of test equivalence.
Because we were watching carefully and in detail while children performed the
pretest, validating activity, and posttest, we were in many cases able to
document "what happened" between pretest and posttest. First of all, of course,
children learn from the activities themselves or from other children, thus
improving their performance from pretest to posttest. Sometimes they are
influenced by other students in the validation setting to change in midstream
from a correct to an incorrect strategy, so that their performance actually
declines from pretest to posttest. Because the validation process went on for
an hour, fatigue is also a factor. Because validation took place alongside
regular classroom activity, distraction is unavoidable. Finally, there is an
irreducible inconsistency of performance that occurs, particularly in the
absence of feedback, as children who are not sure about how to do something try
several different strategies.

The fourteen profiles just described carry an ambiguous message about the
validity of tests, particularly because the performance on the tests themselves
is inconsistent from pretest to posttest. Although we can, by other means,
demonstrate the equivalence of the tests themselves, for most children there is
no way to distinguish between simple instability of performance and the
influences on performance of the test or validating procedures. We call these
cases "neutral"; those shown in Figure 8, which are generally rising, we call
"neutral up," while those in Figure 9, which are generally decreasing, we call
"neutral down."

THE VALIDATION CUBE

We have examined twenty-seven possible performance profiles that categorize
validation results. Each profile consists of a triplet of categories: top,
middle, or bottom for each of the performances on the pretest, validation
activity, and posttest. Each could, therefore, be represented by a triplet such
as (B,M,T) which, for example, would mean bottom performance on the pretest,
middle performance on the validating activity, and top performance on the
posttest. All 27 triplets can be represented by a 27-celled cube, as shown in
Figure 10, where we have presented performance categories on the three
validating steps according to the conventional right-handed x, y, z coordinate
system. "Bottom" performance is placed nearest to the origin of each axis.
Because we are classifying rather than quantifying, the "middle" region is
depicted with the same dimensions as the other two.

We call this display of validation results the "validation cube."

Each of the twenty-seven cells in the validation cube corresponds to a
single profile described in the previous section. For example, the performance
of child #9007 shown in Figure 5 would be classified in the cell labeled "A" in
Figure 10 because the child performed at the bottom on the pretest, at the
middle on the game, and at the bottom on the posttest.
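As an illustration of the cube as a data structure, the sketch below tallies children into its 27 cells. It is a sketch under stated assumptions: the names `build_cube` and `cell_coordinates`, the assignment of pretest, activity, and posttest to the x, y, and z axes in that order, and the sample triplets are all hypothetical; only the triplet (B, M, B) for child #9007 comes from the text.

```python
from collections import Counter
from itertools import product

# Index 0 ("B") lies nearest the origin on each axis, as in the text.
CATEGORIES = ("B", "M", "T")

def build_cube(triplets):
    """Count how many children fall in each of the 27 cells."""
    counts = Counter(triplets)
    return {cell: counts.get(cell, 0) for cell in product(CATEGORIES, repeat=3)}

def cell_coordinates(triplet):
    """Map a (pre, activity, post) triplet to integer cube coordinates.

    The axis order (pre, activity, post) is an assumption of this sketch.
    """
    return tuple(CATEGORIES.index(c) for c in triplet)

# Hypothetical sample; the first entry is the profile reported for child #9007.
observed = [("B", "M", "B"), ("T", "T", "T"), ("T", "T", "T"), ("B", "B", "B")]
cube = build_cube(observed)
print(cube[("T", "T", "T")], cell_coordinates(("B", "M", "B")))
```

Keeping empty cells in the dictionary makes it straightforward to print every slice of the cube, including cells in which no child was observed.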




The validation cube can be exploded as in Figure 11 to show the sets of
boxes corresponding to the validating, invalidating, and neutral profiles.

Since it is difficult to visualize the validation cube in three dimensions,
we slice it for presentation on a page as shown in Figures 12 and 13. In the
latter figure, the capital letters V, I, N stand for the profiles which tend to
validate, invalidate, and are neutral, respectively. There are only three boxes
[...]




The initial area test, described in Section IV above, was validated with 52
children who can be described in the following ways: 29 were male and 23 were
female; 12 were Black, 16 were Hispanic, 22 were White, and 2 were other; 2
were 7 years old, 16 were 8 years old, 19 were 9 years old, 10 were 10 years
old, 2 were 11 years old, 2 were 12 years old, and 1 was 13 years old. The
validation results are shown in Figure 14 for the two subskills described
earlier as "whole units" and "half units." What do these results say about the
validity of the test? We feel they constitute a prima facie case that the test
is valid for two subtasks of identifying the attribute, using the overall
criteria: a large fraction of cases show dichotomous and consistent
performance.

IV TECHNICAL CONSIDERATIONS

We discuss in this section some technical considerations that cannot be
avoided if one wishes to close the loopholes on the prima facie case that our
procedure can produce tests valid for assessing performance on subtasks:

- Are the different versions of the same test equivalent?

- What is an adequate sample size for validating a test?

- What constitutes the "presence" or "absence" of a subtask?

- How high is "high" performance and how low is "low"?

- Do validation results provide information about the distinguishability
and relative difficulty of subtasks?

Before taking up these questions, we need to discuss one significant detail
of low performance on the validation procedure. Because we worked in a wide
variety of classrooms, regardless of whether or not instruction had occurred in
the topics we were testing, we needed to be sensitive to the students' reactions
to our requests for performance on skills they may not have known. We were
uncomfortable asking students to work for an hour on something they could not
do.

The procedure we adopted was as follows: we encouraged all children to try
for as long as they could. When a student said he or she could not do a task,
we explained the examples on the test as clearly as we could without teaching
and then asked them to look at the test items. If at that point they said they
could not do it and the staff member felt confident that this was the case, we
stopped. For example, there were 7-year-olds who told us they could not tell
time except for the "o'clocks and the thirties." Children stopped at various
points during the validating procedures.




Our policy was that if a child could not do a subtask, we classified it as
a valid bottom performance. This may be criticized as including in the
validation results children who did not complete the entire validation process.
However, we observed that these children could not, and they said they could
not, carry through this procedure. All available evidence pointed to consistent
performance at a low level. It was not feasible for all tests to locate a large
number of children who could not carry out the tasks and were willing to spend
an hour attempting things they could not do.

For each subtask on the validation charts in Section V, the number of
children with whom we had to deal in this way is indicated by the phrase
"aborted valid bottoms."

TEST EQUIVALENCE

One product of our test development method is a set of rules for generating
each item in alternative versions. Typically we produced alternative versions
of each test for the validation process. The validation itself depended
crucially on the equivalence of these versions, since its major criterion was
consistent performance across similar tasks. The availability of equivalent
forms of each test makes pretesting and posttesting possible during validation
and secrecy unnecessary in later use. But we do need to demonstrate that the
decisions made about a student's performance will not depend on the form of the
test administered.

We demonstrated equivalence by giving pairs of tests to a group of students
on the same day and comparing the number of errors on each subtask. Parallel
forms should yield consistent performance for each student for each subtask.
During the development phase, inconsistencies helped us to pinpoint individual
items that needed revision. By repeatedly revising our items in response to
classroom results, we were able to achieve a high consistency of performance for
every subtask on parallel forms of each test.

What is a criterion for "consistent performance"? On each test there were
between two and nine opportunities to demonstrate each subtask, with three and
four opportunities dominating. For those subtasks with only two opportunities,
we judged equivalence according to whether or not there was an equal number of
errors on the first test given compared with the second test given for that
subtask. For more than two opportunities, exactly equal numbers of errors for
each subtask on each test seemed an unreasonably stringent criterion. For these
cases we judged equivalence according to whether or not the number of errors on
the first test given differed by no more than one from the number of errors on
the second test given for that subtask.

Figure 21 shows, by example, how we display equivalency data for the "whole
units" subtask of the initial area test. There are four opportunities to
demonstrate this subskill on each test. A total of twenty students from the
third and fourth grade took the A and B versions of this test. For some, the
first test was version A; for others the first test was version B. The number
of errors on the first test taken is plotted on the horizontal axis, and the
number of errors on the second test taken is plotted on the vertical axis of
Figure 21. The numbers in the cells are the total numbers of students whose
performance fell in that region. The band outlined boldly shows the boundaries
of our criterion for equivalent performance. (An important characteristic of
this test of equivalence is the range of performance from 0 errors to 4 errors
on each test.)

Table 1 shows equivalency results expressed as a percentage for the
subtasks on all the tests. Percentage of equivalent performance is equal to
(# of cases of equivalent performance)/[(# of cases of equivalent performance) +
(# of cases of non-equivalent performance)] x 100%.
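The equivalence criterion and the percentage of equivalent performance can be sketched together as follows. The function names `forms_equivalent` and `percent_equivalent` and the sample error counts are hypothetical and are not data from Figure 21; the sketch assumes each case is a pair (errors on first test, errors on second test) for one student on one subtask.

```python
def forms_equivalent(errors_first, errors_second, opportunities):
    """Apply the stated criterion for one student on one subtask.

    With exactly two opportunities the error counts must be equal; with more
    than two they may differ by no more than one.
    """
    if opportunities < 2:
        raise ValueError("the criterion is stated for two or more opportunities")
    tolerance = 0 if opportunities == 2 else 1
    return abs(errors_first - errors_second) <= tolerance

def percent_equivalent(cases, opportunities):
    """Equivalent cases as a percentage of all (equivalent + non-equivalent) cases."""
    if not cases:
        raise ValueError("no cases to evaluate")
    equivalent = sum(forms_equivalent(a, b, opportunities) for a, b in cases)
    return 100.0 * equivalent / len(cases)

# Hypothetical error pairs for a subtask with four opportunities per test.
sample = [(0, 0), (1, 2), (0, 2), (3, 4)]
print(percent_equivalent(sample, opportunities=4))
```

The tolerance of one error corresponds to the band outlined along the diagonal of a first-test by second-test grid.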

ADEQUACY OF SAMPLE SIZE

There are several approaches to selecting sample size for studies of
people. At one extreme is the case study method, where close attention is paid
to individuals and conclusions are drawn on the basis of small numbers of
cases. At the other extreme is the statistical analysis of data from large
numbers of people. Our method lies between these extremes, with at least 40
students participating in the validation of each test. The maximum validation
sample was 79.

We are trying to make a prima facie case for the validity of our tests based
on the "validation cube" displays presented in Section V. We feel that a severe
test for the adequacy of the sample size is to cut this number in half using
random selection, and see if the reduced sample still implies validity. Figure
22 shows the results of such a procedure for the two subtasks of the initial
area test. The "uncut" data were presented above in Figure 14. For comparison,
the half-sample results have been multiplied by 2 and entered in each cell in
parentheses in Figure 22. Our feeling is that analysis of the subset would
provide as powerful a case for the prima facie validity of this test as did the
original full sample size.

We have carried out the above procedure for every subtask of every
measurement test. It is cumbersome to show all of these secondary validation
cubes. As a rough measure of the confidence in validity, we have defined a
"validating percentage" as the fraction (# of validating cases)/[(# of
validating cases) + (# of invalidating cases)] converted to a percentage. Table
2 compares the validating percentages for the full sample for each of the 23
subtasks on our tests with the validating percentages for the "half-samples."
We feel these results justify the conclusion that the sample size we have chosen
is sufficiently large to demonstrate validity, again with the exception of the
weight test.
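The validating percentage and the half-sample check can be sketched as below. This is an illustration under assumptions, not the authors' code: the names `validating_percentage`, `percentage_from_labels`, and `half_sample`, the labels "V", "I", and "N" for validating, invalidating, and neutral cases, and the sample data are all hypothetical. Neutral cases are left out of the denominator, as the definition above implies.

```python
import random

def validating_percentage(n_validating, n_invalidating):
    """Validating cases as a percentage of validating plus invalidating cases."""
    total = n_validating + n_invalidating
    if total == 0:
        raise ValueError("no validating or invalidating cases")
    return 100.0 * n_validating / total

def percentage_from_labels(labels):
    """Compute the validating percentage from per-child labels V, I, or N."""
    return validating_percentage(labels.count("V"), labels.count("I"))

def half_sample(cases, seed=None):
    """Draw a random half of the cases without replacement."""
    rng = random.Random(seed)
    return rng.sample(cases, len(cases) // 2)

# Hypothetical labels for one subtask (not data from Table 2).
full = ["V"] * 40 + ["I"] * 4 + ["N"] * 8
half = half_sample(full, seed=1)
print(percentage_from_labels(full))
print(len(half))
```

Comparing the percentage for the full sample with the percentage for several random halves gives a rough sense of how stable the conclusion is at this sample size.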

DISTINGUISHABILITY AND RELATIVE DIFFICULTY OF SUBTASKS

The subtasks for which the final versions of our tests are validated are
[...] and are refined so that most children perform either "very high" or
"very low," with few in the middle for that subtask. Our measurement model is
task-oriented and does not incorporate a theory that accounts for differences in
performance on subtasks. However, the validation results can be used to provide
evidence about the distinguishability between subtasks and their relative
difficulty. If all children performed equally well on all subtasks, one might
worry about whether these subtasks had been adequately distinguished from one
another and whether independent subtasks do, in fact, exist.

From our validation data we defined a performance percentage for each
subtask as the fraction (# of valid top cases)/(total # of valid cases)
converted to a percentage. The results for each subtask are shown in Table 3.
The differences in performance percentages between different subtasks on each
test provide evidence for distinguishability between subtasks, and seem to
confirm common-sense notions about relative difficulty of these subtasks.
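A minimal sketch of the performance percentage follows. The function name `performance_percentage` and the counts are hypothetical, and the sketch assumes that "valid cases" means the validating top, middle, and bottom cases taken together.

```python
def performance_percentage(valid_top, valid_middle, valid_bottom):
    """Valid top cases as a percentage of all valid (validating) cases."""
    total = valid_top + valid_middle + valid_bottom
    if total == 0:
        raise ValueError("no valid cases for this subtask")
    return 100.0 * valid_top / total

# Hypothetical counts for two subtasks of one test (not data from Table 3).
easier = performance_percentage(valid_top=30, valid_middle=0, valid_bottom=10)
harder = performance_percentage(valid_top=12, valid_middle=0, valid_bottom=28)
print(easier, harder)
```

A clear gap between the two percentages is the kind of evidence the text describes for treating the subtasks as distinguishable and of different difficulty.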

CRITERIA FOR HIGH AND LOW PERFORMANCE

Our validation depended upon categorizing performance as "top," "middle,"
or "bottom." It was important to our results that most performances fell in
either the "top" or "bottom" categories. How much are our results affected by
the location of the boundaries which we place on "high" and "low" performance?

We need to test the sensitivity of the number of validating cases to the
location of the boundaries on our performance categories. We tested this
sensitivity by analyzing the same data with three sets of boundaries. These
boundaries are shown in Figure 2. The results for 71 children who took the
extended length measurement test are shown in Table 4.



The first two internal divisions, in which "top" and "bottom" categories
are either one-fifth or one-fourth of the region, satisfy our criterion that the
"middle" region be the widest. As shown in Table 4, the number of validating
cases appears to be insensitive to these two locations of the internal
boundaries of performance categories. Even in the radical test of sensitivity
that violates our stipulation about the size of the "middle" region, the number
of validating cases was changed significantly for only one subtask.

The low sensitivity of validation results to the position of the internal
boundaries means that the location of these boundaries may be chosen somewhat
arbitrarily. We set the boundaries one quarter of the way from the top and the
bottom. This choice yields a middle region twice as wide as the regions at the
top and bottom.



Our decision about the location of the internal boundaries that determine
"top" and "bottom" performance influenced the number of opportunities we had to
include for each subtask on pretest, validation activities, and posttest. We
set the minimum number of opportunities at four. This makes it possible for a
single error to still be called "top" performance, because it falls on the upper
internal boundary.

In analyzing validation data, performance was judged consistent among
pretest, validation activities, and posttest if all three points lay within a
single region. The internal boundaries were considered parts of both adjacent
regions. For example, a performance percentage triplet 75%, 100%, 75% was
considered to be "valid top" whereas a triplet 75%, 50%, 75% was considered to
be a "valid middle."
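The shared-boundary rule can be made concrete with a short sketch. The names `regions` and `consistent_regions` and the `edge` parameter (the width of the top and bottom regions, one quarter by default) are hypothetical. How a triplet lying entirely on a boundary, such as 75%, 75%, 75%, was categorized is not stated in the text, so the sketch simply returns every region the three values share.

```python
from fractions import Fraction

def regions(fraction, edge=Fraction(1, 4)):
    """Return every region containing `fraction`.

    A value on an internal boundary belongs to both adjacent regions.
    """
    low, high = edge, 1 - edge
    found = set()
    if fraction <= low:
        found.add("bottom")
    if low <= fraction <= high:
        found.add("middle")
    if fraction >= high:
        found.add("top")
    return found

def consistent_regions(pre, activity, post, edge=Fraction(1, 4)):
    """Regions shared by all three performances (empty set if inconsistent)."""
    return regions(pre, edge) & regions(activity, edge) & regions(post, edge)

# The two triplets given as examples in the text.
print(consistent_regions(Fraction(3, 4), Fraction(1), Fraction(3, 4)))
print(consistent_regions(Fraction(3, 4), Fraction(1, 2), Fraction(3, 4)))
```

Passing `edge=Fraction(1, 5)` reproduces the one-fifth division; with four opportunities a single error (3/4) then falls in the middle region only, which shows why the choice of boundary interacts with the minimum number of opportunities.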

CONCLUSION

In summary, our categorical validation method can be outlined in four steps
through which one cycles until success or failure is manifest:

1. Develop a model for the task being probed;

2. Use the model to analyze the task into subtasks;

3. Use games and other "validation activities" to determine whether
performance on subtasks is "all-or-none";

4. Develop tests, and validate them by showing consistency of "all-or-none"
performance on tests and validating activities for each subtask.

When successful, the procedure results in tests that characterize and help
to diagnose performance without applying a numerical scale to individuals.

When unsuccessful, the procedure can reveal inadequate understanding by the
test developer of the task being probed or the appropriate decomposition into
subtasks. Repeatedly the procedure has helped to correct our analysis of
measurement and the ways in which students carry it out.



Lack of success in the procedure can also imply a limitation in the
procedure itself. Human performance is complex and we are accustomed to having
nature, especially human nature, escape the models we devise to describe it. We
hope that this procedure can be adapted to apply to a range of tasks that are
important in schools and useful for children.







APPENDIX:
SUBTASKS FOR MEASUREMENT TESTS

TIME-TELLING TEST

This test focuses on scale reading: the task of reading a traditional
clock face and reporting the time in any of the conventional written or oral
notation systems. Students are not asked to measure time intervals. The test
has been validated for the following subtasks:

1. Reporting the minute scale for the 1/2 hour, 1/4 hour, and 3/4
hour positions.

2. Reporting the minute scale for the 5 minute positions.

3. Reporting the minute scale for the 1 minute positions.

4. Reporting the hour scale, even when the hour hand is between two
numbers.

5. Reporting the correct relationship between the minutes and the
preceding or approaching hour.

INITIAL LENGTH TEST 

This test focuses on scale reading by presenting lengths to be measured
with a ten-centimeter ruler which is calibrated to .5 cm. This ruler has a blank
tab one centimeter in length before the zero point and a blank tab two
centimeters in length after the ten centimeter point.

This test has been validated for the following subtasks: 

1. Choosing a correct starting point: placing the ruler correctly
along the line to be measured.

2. Measuring lines of integer length which are shorter than the
ruler, such as 7 cm.

3. Measuring lines of non-integer length which are shorter than the
ruler, such as 7-1/2 cm.

4. Measuring lines of integer length which are longer than the ruler
(between 11 cm and 19 cm).

5. Identifying the "longest" or "shortest" side of a trapezoid and
measuring its length.



EXTENDED LENGTH TEST

This test focuses on scale reading and judging appropriate precision.
Students measure lines with a ten-centimeter tab ruler calibrated to .1 cm.

The subtasks are: 

1. Choosing a correct starting point: placing the ruler correctly
along the line to be measured.

2. Measuring lines of integer length which are shorter than the
ruler, such as 7 cm.

3. Measuring lines of non-integer length which are shorter than the
ruler, such as 7.3 cm.

4. Measuring lines of integer length which are longer than the ruler
(between 11 cm and 19 cm).

5. Identifying the "longest" and "shortest" sides of a triangle and
the "length" of a pencil and measuring them.



INITIAL AREA TEST 

This test, described in detail in the text of the paper, assesses
performance on the task of identifying the attribute of area. The test helps
teachers know whether or not a student can distinguish area from the shape in
which it occurs and from lengths. Students use a transparent acetate "ruler,"
composed of a strip of five "tile" units, to cover regions and compute and
report area.

The test has been validated for the following subtasks:

1. Measuring the area of rectangular or irregular shapes which have
interior cells and which can be covered by whole units. "Interior
cells" refers to that surface area which, when covered by unit
"tiles," does not lie along an edge.

2. Measuring the area of irregular shapes which have no interior
cells but which must be covered in part by rectangular half-tiles.

EXTENDED AREA TEST

This test examines performance on the tasks of identifying, measuring, and
reporting the area of a variety of shapes. The student uses a ten-centimeter
ruler to measure lengths, from which area can be computed. The nonrectangular
shapes on this test defy routine multiplications of "length times width":
students must apply their understanding of the formula.

The test has been validated for the following subtasks:

1. Computing the area of a rectangle whose dimensions must be [...]

2. Computing the area of an irregular shape whose dimensions must be [...]

3. Computing the area of a right triangle whose dimensions must be
measured, and which is presented as half of a rectangle.

VOLUME TEST

This test asks students to calculate the number of unit cubes in a three-
dimensional figure pictured in perspective. Students need to devise strategies
other than unit counting in order to find the number of "cubes" needed to
construct each building pictured on the test.

This test has been validated for the following subtasks:

1. Finding the number of cubes in a "regular" solid built from unit
cubes.

2. Finding the number of unit cubes in an "irregular" solid.



[Student copy of the initial area test, printed in English and Spanish
(copyright 1979, Education Development Center, Inc.). The sheet has spaces for
name, grade, and date, shows a sample shape with the statement "This shape has
an area of 1 tile," asks "What is the area of each of these shapes?" and gives
for each lettered item an answer blank in tiles with the prompts "How many?"
and "What?"]



Figure 5
Three Performance Categories for Child #9007
Whole Unit Subskill, Initial Area Test
[Plot not reproduced; horizontal axis: Pre., Activ., Post.]

Figure 4
Three Performance Fractions for Child #9007
Whole Unit Subskill, Initial Area Test
[Plot not reproduced; vertical scale of performance fractions in fourths;
horizontal axis: Pretest, Activity, Posttest.]



Figure 6

Five Invalidating Profiles That Give
Rise to False Positives on Tests - "Invalid Vee"

[Panels 6a, 6b, 6c, 6d, 6e]

Note: For the profiles 6a and 6e, the performance category for the pretest
is different from the performance category for the posttest.



Figure 7

Five Invalidating Profiles That Give
Rise to False Negatives on Tests - "Invalid Lambda"

[Panels 7a, 7b, 7c, 7d, 7e]

Note: For profiles 7a and 7e, the performance category for the pretest
is different from the performance category for the posttest.



Figure 8

Seven Profiles Which Show Improving Performance
- "Neutral Up"

[Panels 8a, 8b, 8c, 8d, 8e, 8f, 8g]



Figure 9

Seven Profiles Which Show Declining
Performance - "Neutral Down"

[Panels 9a, 9b, 9c, 9d, 9e, 9f, 9g]



Figure 10. Validation Cube

Figure 12. Sliced View of Validation Cube

Figure 13. Flat Page Representation of Validation Cube

[Diagram: axes labeled Pre, Game, and Post; slices labeled Game Middle and
Game Bottom.]



Figure 14. Validation Results for Initial Area Test

[Two panels: Whole Unit "covering" and Half Unit "covering". Each panel is a
flat-page validation cube made up of three Pre x Post grids (rows and columns
T, M, B), one each for game top, game middle, and game bottom. Each cell
holds a count and a label V, N, or I. The panels carry the notes "4 Aborted
Valid Bottom" and "5 Aborted Valid Bottom".]



Figure 21. Equivalency Results
for Half Units Subskill
of Initial Area Test

[Grid: vertical axis "# of errors on second test taken" (0 to 4); horizontal
axis "# of errors on first test taken" (0 to 4); each cell holds a count of
students.]



Figure 22. Half Sample Validation Results for Initial Area Test

[Pre x Post grids (rows and columns T, M, B) for game top and game middle.
Cell entries take the form n(m), with a label V, N, or I.]




of Dichotomous Performance of Subtasks
for Pretest, Validating Activities, and Posttest

Subtasks          Pretest          Validating Activities          Posttest

Time Telling:

Minutes to 15
Minutes to 5
Minutes to 1
Hour Scale
Lang./Rel.

Initial Length:

Starting Point
Attribute
Shorter than the Ruler
Longer than the Ruler
Non-integer

Extended Length:

Starting Point
Attribute
Shorter than the Ruler
Longer than the Ruler
Non-integer

Initial Area:

Whole Units
Half Units



91£ 



91% 
62% 
96% 



93% 
93% 
100% 

89% 



97%' 



f*98% 
86% 



96% 
90% 



97% 
89% 

s m 

76% 

;96% 



85%-.A 
97%& 

>ioo| 

87% 
89% 



94% 
94% 
92% 
77% 
78%- 



94% 



*'92% 



89% ' i 

78% .v'. 
97% ~ 

•97%^''-^ 
98% 
92% 
93% 



• '..95% 
86% 
98%' 
98% 

/ 88% 



90% 
94% 



                         Pretest    Validating Activities    Posttest

Extended Area:

Rectangular Regions        98%             100%                100%
Irregular Regions          98%              98%                100%
Triangular Regions         96%              96%                 98%

Volume:

Rectangular Solids         86%              89%                 95%
Irregular Solids           85%              89%                 93%



short hc-ii: i labels nbotl to ci.uo.-:>: 






Table 2
Percentages of Equivalent Performance
For Each Subskill on Each Test

Time Telling:

Minutes to 15
Minutes to 5
Minutes to 1
Hour Scale
Lang./Rel.

Starting Point
Shorter than the ruler
Longer than the ruler
Non-integer



100%' 
100% 
98% 

^4%~ 

100%' 



90% 
86% 

100% 

73% 



Initial Area:

Whole Units
Half Units

Rectangular Regions

Triangular Regions

Volume:

Rectangular Solids
Irregular Solids



1005 
1685 



92% 
100% 
100% 



89% 
95%- 



Extended Length:

Starting Point
Attribute
Shorter than the ruler
Longer than the ruler
Non-integer



86% 
100% 



100% 



95%' 
'62% 






Percentages of
Validation Sample

and a Random Half

Subskills

% Valid for
Total Sample

% Valid for

Minutes to 15
Minutes to 5
Minutes to 1
Hour Scale
Lang.



•.«f 



85 - 



97 
94 
85 
72 
97 



Initial Length:

Starting Point
Attribute
Shorter than the Ruler
Longer than the Ruler
Non-integer

Extended Length:

Starting Point
Attribute
Shorter than the Ruler
Longer than the Ruler
Non-integer

Initial Area:

Whole Units
Half Units

Extended Area:

Rectangular Regions
Irregular Regions
Triangular Regions

Volume:

Rectangular Solids
Irregular Solids



88 $v 

lop 

98 ; f w 
94' > ^ 
88 



93 
>96 



92 
91 

I 



87 
100 
100 
94 
87 



i v 



V • 



100 1 
•83 
95 
83 

88 . 



ioo 

"it 78 
90 

<v. - 90 

93 



96 
96 



• ■ 98 
.. . 100 



, 96 
100 



■•••I 

*ioo 

*92 






Performance Percentages for Subskills

(# Valid Top / Total # Valid)

Time Telling:

Minutes to 15
Minutes to 5
Minutes to 1
Hour Scale

Initial Length:

Starting Point
Attribute
Shorter than the Ruler
Longer than the Ruler
Non-integer

Extended Length:

Starting Point
Attribute
Shorter than the Ruler
Longer than the Ruler
Non-integer

Initial Area:

Whole Units
Half Units

Extended Area:

Rectangular Regions
Irregular Regions
Triangular Regions

Volume:

Rectangular Solids



91% 
76% 
72% 
65% 
85% 



80% 
• 98% 
100% 

66% 
. 63% 



89% 
100% 



,69% 



34% 



34% 
17% 
26% 



56% 
50% 




Table 5

Sensitivity of Validation Results to Boundaries
of Performance Categories for Extended Length Measurement Test



Top '''tmisct'dm 
Fr;ictions 



Validating 



Neutral 



Invalidating 



Identify 
Attribute 



1/5 

m 

1/3 



51 
54 
64 



7 
7 
3 



\:f- '"Starting 
Point 



1/5 
1/4 
1/3 



57 
,60 
60 



16 
10 
10 



Integer 

Length 

Shorter 



1/5. 
1/4 
1/3 



64 
65 
68 



„ .4 

, 1 



3.., 

3 

2 



Length ^ v 
Longer 


I 1/5 - 

1 1/4 

'-••.-9 j— ... . 


48 
50 

a 51 ;■■>'': 


• . ■ ' i 




Non-integer 


1/5 


50 


12 


: •' ' " :) 9 {: r, ; 


Length 


. 1/4 


51 


12 


' "' ' ■ ' .8 




.1/3 ,:■ 


55 


10 



















CONCLUSION AND RECOMMENDATIONS

Conventional educational tests, we have argued, do not serve teaching
and learning well. There is little evidence that teachers use testing much to
guide instruction, and yet there is considerable public pressure to increase
the amount of testing in the schools.

This widespread public call for testing presents an unusual opportunity,
we think, to begin to develop new assessment practices more helpful to
teachers in the practice of teaching. Assessment, we have argued, should be
viewed as an integral part of the teaching and learning process. If it is,
there are guidelines that should be followed in developing new assessment
materials.

ASSESSMENT THAT REFLECTS THE CHARACTER OF TEACHING AND LEARNING

In general, we believe the preparation of test materials should begin
and end in the classroom, in close interaction with teachers and students. If
assessment materials are to serve instruction, they must be informed by an
understanding of the ways in which children learn, and demonstrate their
knowledge, in the subject areas assessed. Too often, the only empirical work
underlying conventional standardized tests is a statistical analysis of
test-item scores, performed at the end of the test-development process. And,
for many objectives-based tests, no empirical work is done at all.

We believe test development should ordinarily involve three loosely
defined steps: open-ended observation of children and their work, the
development of somewhat more focused assessment activities, and finally (in
some cases) the development of formal assessment instruments. The preparation
of test materials should begin with careful observation of children engaged in
the sorts of learning tasks the tests are designed to assess. Only by observing
children and their work is it possible to identify the kinds of strengths and
weaknesses children typically display in coming to terms with a subject area.

If observation is successful, it should lead to the development of
more focused games, exercises, and activities that embody the learning tasks
being assessed. These games and activities - midway between open-ended
observation and formal tests - should elicit some of the patterns and
regularities underlying children's work. By watching children completing these
games and activities, observers should be able to identify some of the
competencies individual children display, differentiate among typical errors,
and interpret these errors in terms of the trains of thought that might have
produced them.



In some subject areas, we believe, semi-focused games and exercises
may be the most rigorous form of assessment desirable. Sometimes -
particularly in the sciences, social studies, and the arts - there is no good
reason to move from informal exercises to the development of formal tests
(other than teacher-made tests). In other areas - particularly reading,
writing, and elementary mathematics - formal, easily administered tests may
have important instructional value.

When formal tests are developed, we believe they should be based on
extensive work with children, using less formal exercises and activities.
Only in this way is it possible to assert with any confidence that the formal
tests adequately represent the learning tasks being assessed. Furthermore, we
believe that when formal tests are developed, they should not simply be scored
in terms of questions right and wrong. Instead, test items should be designed
to elicit commonly made errors, and the test scoring system should call
attention to the precise kinds of errors each child has made.
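
A scoring system of this kind can be sketched briefly. Everything in the sketch is hypothetical: the items, the anticipated wrong answers, and the error labels are invented for illustration and are not drawn from the tests described in this report. The point is only that the scorer reports which kinds of errors a child made, rather than a count of questions right and wrong.

```python
from collections import Counter

# Hypothetical area items. Each lists the correct answer (in tiles) and the
# wrong answers a particular misconception would produce.
ITEMS = {
    "a": {"correct": 5, "known_errors": {6: "counted half tiles as whole tiles",
                                         4: "ignored half tiles"}},
    "b": {"correct": 3, "known_errors": {4: "counted half tiles as whole tiles",
                                         2: "ignored half tiles"}},
    "c": {"correct": 8, "known_errors": {}},
}


def diagnose(item_id: str, answer: int) -> str:
    """Return 'correct' or a label naming the kind of error the answer suggests."""
    item = ITEMS[item_id]
    if answer == item["correct"]:
        return "correct"
    return item["known_errors"].get(answer, "unclassified error")


def error_profile(responses: dict[str, int]) -> Counter:
    """Tally the kinds of responses one child gave across all items."""
    return Counter(diagnose(item_id, answer) for item_id, answer in responses.items())


# One hypothetical child's answers.
profile = error_profile({"a": 6, "b": 4, "c": 8})
for kind, count in profile.items():
    print(f"{kind}: {count}")
```

Answers that match no anticipated error are kept as "unclassified error" rather than discarded, so that the teacher can look at the child's actual work.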

In most cases, we believe, it is not possible to use multiple choice
questions to obtain the sorts of error information needed. Open-ended
"constructed answer" questions permit students to make a wide variety of
errors, and this diversity is essential in attempting to determine the source
of student strengths and weaknesses.

We have focused so far on the development of new materials. While we
believe new materials are important, we believe it is equally important to
find ways of helping teachers improve their day-to-day skills in observing
students and interpreting their work.

One of the central ways in which a teacher can guide a student's
learning is by gaining insight into how a child is thinking in a particular
situation, and where the child might usefully move next. The sensitivity and
skill involved in this sort of continuing assessment and diagnosis are
difficult to acquire, and there is little research to indicate what sorts of
training programs might be successful. But we believe additional work in this
area could be important.

Techniques need to be developed to assist teachers in evaluating and
interpreting students' regular classroom work. Student essays, art work, and
problem sets hold a good deal of information. Materials to help teachers draw
diagnostic inferences from regular classroom work have begun to appear, but
more work is needed.

Finally, much can be learned by looking at the development of
children's work over fairly long spans of time, longer than a regular school
year. Ways need to be found to collect systematic samples of student work over
time, so that teachers can use the work to uncover student strengths, gauge
student progress, and discover continuing problems. This approach to
assessment, sometimes called documentation, has been implemented by Patricia
Carini at the Prospect School in Vermont.

ASSESSMENT THAT OFFERS DIVERSITY

Questions about whether a child has mastered a particular cognitive
skill are rarely if ever answered once and for all. A child who can compute
the area of a geometric figure in one context may fail to display the skill at
all in another context. A child who speaks fluently in one context may speak
only in one or two word sentences in another. And a child who writes in detail
on one subject may write haltingly on another.

Children respond differently in different contexts partly as a result
of differences in interests and tastes. But partly, as we have argued, these
contextual differences are more profound. They arise because children of
different cultures bring different stocks of knowledge and experience to bear
on cognitive tasks.



If assessment is to serve instruction, it must capitalize on the
diversity among cultures and among children within cultures. Assessment
materials should offer students multiple contexts in which to demonstrate
competence. Thus, for example, diagnostic materials in reading should as a
matter of course include a variety of topics and styles, and diagnostic
materials in mathematics should include problem sets in widely differing
contexts.

Furthermore, we believe test materials should include guidelines for
teachers, indicating how cognitive tasks similar in structure to those on the
tests can be created using local contexts and materials. Often, we believe,
assessment can be strengthened by drawing on stories and topics from the local
community, or even the classroom, including materials created by children
themselves.

One of the principal elements of the practice of teaching is choosing
materials for each child that are likely to engage his skills and competencies.
Skills developed in one context can then be strengthened and expanded, so that
they can be applied in increasingly diverse and challenging settings. While the
development of diverse assessment materials can help in the process of
identifying strengths and capitalizing upon them, the ultimate success of this
process depends on the sensitivity and insight of the teacher. Here, as before,
we think that the development of new materials should be coupled with increased
resources for in-service training. Materials by themselves, while always
necessary, are never sufficient.



ASSESSMENT THAT ENCOURAGES DIALOGUE

Much attention in standardized educational testing has gone into
efforts to express test results as numerical scores. But often, we believe,
quantitative test scores hide as much as they reveal. Particularly for purposes
of teaching and learning, we believe more can be gained by looking at student
questions and answers themselves than by looking at numerical summaries.

We have argued that tests serve instruction by helping teachers
interpret the thought processes underlying student work. A teacher's
interpretation of a student's work is always tentative and exploratory, and
teachers can often gain insight by discussing the work with the student, other
teachers, and parents. Assessment materials can often provide particularly
well-focused examples of student work, which can serve as a foundation for this
sort of dialogue and discussion.

If assessment materials are to serve as a foundation for dialogue
between teachers, students, and parents, then student test forms must be
returned to students as soon as possible after the tests are completed.
Generally, we believe this means that tests for instructional purposes must be
marked by the teacher who administers them (or by the students themselves). It
is extremely unlikely that tests which must be sent off for centralized scoring
can be returned in time to serve instruction.

In addition, we believe, parents can and should contribute to the
assessment of their children's work. One way to do this is to have teachers
discuss assessment questions and answers with parents. Also, parents should be
encouraged to offer assessments of their own, derived from observations of
their children at home. We recognize, of course, that parent involvement in
education is a goal frequently stated but difficult to achieve. Becoming
closely involved in the education of their children is often especially hard
for working parents. We believe that well-designed assessment materials, by
providing clear, focused examples of children's work, can improve the dialogue
between teachers, parents, and students.



Finally, we believe that assessment materials can be used to stimulate
dialogue among students. Inevitably, as we have argued, students approach
assessment tasks in different ways, and this diversity is a resource for
exploration. By encouraging students to compare their approaches to a
cognitive task, teachers can help students become aware of alternative
problem-solving strategies, their advantages and disadvantages. Dialogue,
then, may help students increase their repertoire of strategies.

Dialogue among students may also serve one additional educational
goal. By discussing some of the reasons their answers to assessment questions
differ, students may become more reflective about their thought processes.
Students may learn, when confronting a problem, to consider multiple potential
solutions and how to assess them. Dialogue among students may help them in
asking questions and identifying strengths and weaknesses.

NEXT STEPS



Most of the ideas we have proposed are not, in themselves, new. They
have a long and honorable history in the psychology and philosophy of
education. But they have not yet played a very strong role in the development
of educational assessment.

The development of alternative assessment practices of the sort we
have described will not be easy or inexpensive, but we believe the investment
could reap substantial rewards. We propose the following strategies.

First, it seems to us that, in developing new assessment materials, it
is worth starting small. In our view, it is inappropriate to attempt to
construct, all at once, tests that completely cover a subject area, such as
elementary school mathematics or junior high writing. It is better to carve
out relatively small areas in which careful analysis of the cognitive tasks
involved and close empirical work with children can be carried out.



Even if this recommendation is followed, however, development costs
are likely to be high, a fact made amply clear by the experience of Project
TORQUE. Moreover, the foundations that supported the development of an
alternative to current assessment practice did not support the implementation
of that alternative. The situation at the time of this writing is that these
new methods and materials sit on a shelf awaiting changes of heart,
perspective, and practice on the part of publishers. We expect that work in
reading and writing will be more expensive and more difficult, and will find
even less enthusiastic support among foundations and publishers.




Second, we recommend that the development of new assessment materials
be carried out by groups with a strong interest in the content areas being
assessed. These groups should be deeply involved in all aspects of the
development process, including observation of children, preparation of
materials, and validation. They might also be engaged in pilot efforts to
implement the materials developed. It is not sufficient to engage
subject-matter specialists simply to review test items once they are written.
Educators with strong subject-matter backgrounds must be involved throughout.

Third, we recommend that schools interested in adopting new forms of
assessment begin by focusing on a small number of classrooms and subject
areas. The temptation is to attempt to overhaul a school's assessment program
in one swift step, but we believe such an approach is ill-advised.
Implementing the sorts of ideas we have proposed should be an iterative
process, in which new practices and organizational relationships are slowly
developed.

Finally, making the forms of assessment we have suggested work in
practice will depend on the sensitivity and ingenuity of teachers. It is
unreasonable to ask teachers to be wise and insightful observers of children
and their work if the resources to support classroom teaching are meager and
classrooms overcrowded. The strategies we have proposed cannot be implemented,
at least in the short run, without extra resources for in-service training and
materials. In the long run, however, the ideas we have proposed might not cost
substantially more than present forms of testing, since many of the materials
we have suggested would serve both assessment and instruction.

There is a large and growing demand for improved educational
assessment in the classroom. We are firm in our belief that appropriate
assessment practices are possible. Although the development of new, more useful
assessment materials will require an investment of resources, we believe this
investment is likely to have a profound and beneficial effect on teaching.
Indeed, as we have argued, successful teaching is in large measure a continuing
process of inquiry and assessment.







